You're reading from Journey to Become a Google Cloud Machine Learning Engineer Build the mind and hand of a Google Certified ML professional

Product type Paperback

Published in Sep 2022

Publisher Packt

ISBN-13 9781803233727

Length 330 pages

Edition 1st Edition

Languages

Python

Tools

BigQuery

Concepts

Machine Learning

Author (1):

Dr. Logan Song

ML data storage and processing

As we discussed in Chapter 4, Developing and Deploying ML Models, data storage involves collecting raw data from various sources and storing it in a centralized repository. Data processing, in turn, includes both data engineering and feature engineering. Data engineering converts raw data (data in its source form) into prepared data (a dataset in a form that is ready to be used as input for ML tasks). Feature engineering then transforms the prepared data into the features expected by the ML model.
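To make the two stages concrete, here is a minimal sketch in plain Python. The records, column names, and cleaning policy (dropping rows with missing values) are hypothetical and chosen only for illustration; a real pipeline would typically run in a managed service rather than in-process code.

```python
from statistics import mean, pstdev

# Hypothetical raw records, as they might arrive from a source system.
RAW_ROWS = [
    {"age": "34", "income": "52000", "city": "Dallas "},
    {"age": "", "income": "61000", "city": "austin"},
    {"age": "45", "income": "n/a", "city": "Dallas"},
    {"age": "29", "income": "48000", "city": "Austin"},
]

NUMERIC_COLUMNS = ("age", "income")


def _to_float(text):
    """Return the value as a float, or None if it cannot be parsed."""
    try:
        return float(text)
    except ValueError:
        return None


def data_engineering(raw_rows):
    """Raw data -> prepared data: parse types, normalize text, drop incomplete rows."""
    prepared = []
    for row in raw_rows:
        age = _to_float(row["age"])
        income = _to_float(row["income"])
        if age is None or income is None:
            continue  # assumed policy: skip rows with missing values
        city = row["city"].strip().title()
        prepared.append({"age": age, "income": income, "city": city})
    return prepared


def feature_engineering(prepared_rows):
    """Prepared data -> features: standardize numeric columns, one-hot encode city."""
    cities = sorted({row["city"] for row in prepared_rows})
    stats = {}
    for col in NUMERIC_COLUMNS:
        values = [row[col] for row in prepared_rows]
        # Fall back to 1.0 so a constant column does not divide by zero.
        stats[col] = (mean(values), pstdev(values) or 1.0)
    features = []
    for row in prepared_rows:
        vector = [(row[col] - stats[col][0]) / stats[col][1] for col in NUMERIC_COLUMNS]
        vector += [1.0 if row["city"] == city else 0.0 for city in cities]
        features.append(vector)
    return features


if __name__ == "__main__":
    prepared_rows = data_engineering(RAW_ROWS)
    for vector in feature_engineering(prepared_rows):
        print(vector)
```

The split mirrors the text: `data_engineering` knows nothing about the model, while `feature_engineering` produces the numeric vectors a model would consume.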

For structured data, we recommend using Google Cloud BigQuery (BQ) for both storage and processing. For unstructured data, such as video, audio, and image data, we recommend storing it in Google Cloud Storage (object storage) and processing it with Google Cloud Dataflow or Dataproc. As we have discussed, Dataflow is a managed service that uses the Apache Beam programming model to convert unstructured data into binary formats and can improve data ingestion...
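The recommendation above can be summarized as a small lookup, sketched here in plain Python. The function name, the `kind` labels, and the dictionary layout are hypothetical; "Cloud Storage" is used as the product name for Google Cloud object storage. The sketch only encodes the guidance from the text and does not call any Google Cloud API.

```python
# Storage and processing recommendations from the text, as a lookup table.
RECOMMENDATIONS = {
    "structured": {
        "storage": "BigQuery",
        "processing": ["BigQuery"],
    },
    "unstructured": {
        "storage": "Cloud Storage",
        "processing": ["Dataflow", "Dataproc"],
    },
}

# Hypothetical labels for the unstructured data types named in the text.
UNSTRUCTURED_KINDS = {"video", "audio", "image"}


def recommend(kind):
    """Return the recommended storage and processing services for a data kind."""
    normalized = kind.strip().lower()
    if normalized in UNSTRUCTURED_KINDS or normalized == "unstructured":
        return RECOMMENDATIONS["unstructured"]
    if normalized == "structured":
        return RECOMMENDATIONS["structured"]
    raise ValueError(f"unknown data kind: {kind!r}")


if __name__ == "__main__":
    for kind in ("structured", "image", "audio"):
        choice = recommend(kind)
        print(kind, "->", choice["storage"], "/", ", ".join(choice["processing"]))
```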