Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning
<div><p>Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). This hands-on guide shows developers entering the data science field how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP. Over the course of the book, you’ll work through a sample business decision by employing a variety of data science approaches.</p><p>Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science.</p><p>You’ll learn how to:</p><ul><li>Automate and schedule data ingest using an App Engine application</li><li>Create and populate a dashboard in Google Data Studio</li><li>Build a real-time analysis pipeline to carry out streaming analytics</li><li>Conduct interactive data exploration with Google BigQuery</li><li>Create a Bayesian model on a Cloud Dataproc cluster</li><li>Build a logistic regression machine learning model with Spark</li><li>Compute time-aggregate features with a Cloud Dataflow pipeline</li><li>Create a high-performing prediction model with TensorFlow</li><li>Use your deployed model as a microservice you can access from both batch and real-time pipelines</li></ul></div>