You're reading from Deep Learning with TensorFlow 2 and Keras Regression, ConvNets, GANs, RNNs, NLP, and more with TensorFlow 2 and the Keras API

Product type Paperback

Published in Dec 2019

Publisher Packt

ISBN-13 9781838823412

Length 646 pages

Edition 2nd Edition

Languages

Python

Tools

Keras

Concepts

Deep Learning

Authors (3):

Dr. Amita Kapoor

Sujit Pal

Antonio Gulli

TensorFlow Extended for production

TFX is an end-to-end platform for deploying machine learning pipelines. As part of the TensorFlow ecosystem, it provides a configuration framework and shared libraries that integrate the common components needed to define, launch, and monitor software based on ML models. TFX addresses many of the requirements and best practices of production software deployments, including scalability, consistency, testability, safety, and security.

A TFX pipeline starts with ingesting your data, followed by data validation, feature engineering, training, and serving. Google has created libraries for each major phase of the pipeline, and there are frameworks for a wide range of deployment targets. TFX implements a series of ML pipeline components on top of horizontal layers for pipeline storage, configuration, and orchestration. These layers are essential for managing and optimizing the pipelines and the applications that you run on them.

Before you can use TensorFlow Extended, you need to install it, which can be done with the pip command:

pip install tfx
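To confirm that the package is visible to your Python environment after installation, you can query its installed version. The following is a minimal sketch that uses only the Python standard library; the helper name installed_version is our own and is not part of TFX:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a pip package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("tfx")
    if version is None:
        print("tfx is not installed; run: pip install tfx")
    else:
        print(f"tfx {version} is installed")
```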

In the following section we will cover the fundamentals of TFX, its architecture, and the various libraries available within it.

TFX Pipelines

A TFX pipeline consists of a sequence of components that implement an ML pipeline, designed specifically to ensure the scalability and high performance of the underlying ML task. It covers modeling, training, inference, and deployment to web or mobile targets. Each component consists of three main elements: the Driver, the Executor, and the Publisher. The driver queries the metadata store and supplies the resulting metadata to the executor; the executor performs all the processing; and the publisher accepts the results of the executor and saves them in the metadata store. As an ML software developer, you will need to write the code that runs in the executor, depending upon the component class you are working with.
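The division of labor between driver, executor, and publisher can be illustrated with a toy model in plain Python. This is only a sketch under our own assumptions: the classes MetadataStore and Component, the function count_rows, and the record layout are all hypothetical and are not the TFX or ML Metadata API. For simplicity the toy store holds the data rows themselves, whereas a real metadata store would hold references and properties:

```python
from typing import Callable, Dict, List


class MetadataStore:
    """Toy in-memory stand-in for a metadata store (hypothetical, not MLMD)."""

    def __init__(self) -> None:
        self._records: Dict[str, List[dict]] = {}

    def get(self, key: str) -> List[dict]:
        return list(self._records.get(key, []))

    def put(self, key: str, record: dict) -> None:
        self._records.setdefault(key, []).append(record)


class Component:
    """A component made of three parts: driver, executor, and publisher."""

    def __init__(self, name: str, input_key: str, output_key: str,
                 executor: Callable[[List[dict]], List[dict]]) -> None:
        self.name = name
        self.input_key = input_key
        self.output_key = output_key
        self.executor = executor

    def _driver(self, store: MetadataStore) -> List[dict]:
        # Driver: query the metadata store and hand the result to the executor.
        return store.get(self.input_key)

    def _publisher(self, store: MetadataStore, results: List[dict]) -> None:
        # Publisher: save the executor's results back into the metadata store.
        for record in results:
            store.put(self.output_key, record)

    def run(self, store: MetadataStore) -> List[dict]:
        inputs = self._driver(store)
        outputs = self.executor(inputs)  # all processing happens here
        self._publisher(store, outputs)
        return outputs


def count_rows(inputs: List[dict]) -> List[dict]:
    """User-written executor code: the only part that processes data."""
    return [{"source": rec["uri"], "num_rows": len(rec["rows"])} for rec in inputs]


store = MetadataStore()
store.put("examples", {"uri": "data/train.csv", "rows": [[1, 2], [3, 4], [5, 6]]})
row_counter = Component("RowCounter", "examples", "statistics", count_rows)
row_counter.run(store)
print(store.get("statistics"))  # [{'source': 'data/train.csv', 'num_rows': 3}]
```

Note that only count_rows is specific to the task; the driver and publisher logic is the same for every component, which is why the developer normally only has to supply the executor code.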

In a TFX pipeline, a unit of data, called an artifact, is passed between components. Normally a component has at least one input artifact and one output artifact. Every artifact has associated metadata that defines its type and properties. The artifact type defines the ontology of artifacts in the entire TFX system, while the artifact property specifies the ontology specific to an artifact type. Users have the option to extend the ontology globally or locally.
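To make the relationship between an artifact, its type, and its properties more concrete, here is a small sketch in plain Python. The classes ArtifactType and Artifact, the method set_property, and the property names split and num_rows are assumptions made for illustration and do not reproduce the TFX classes:

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ArtifactType:
    """Names an artifact type and the properties it allows (name -> Python type)."""
    name: str
    properties: Dict[str, type]


@dataclass
class Artifact:
    """A unit of data passed between components, described by its metadata."""
    uri: str
    type: ArtifactType
    properties: Dict[str, Any] = field(default_factory=dict)

    def set_property(self, key: str, value: Any) -> None:
        # Only properties declared by the artifact type are accepted.
        expected = self.type.properties.get(key)
        if expected is None:
            raise KeyError(f"{self.type.name} has no property named {key!r}")
        if not isinstance(value, expected):
            raise TypeError(f"{key!r} must be of type {expected.__name__}")
        self.properties[key] = value


# A hypothetical artifact type for a dataset split.
examples_type = ArtifactType("Examples", {"split": str, "num_rows": int})

train = Artifact(uri="data/train", type=examples_type)
train.set_property("split", "train")
train.set_property("num_rows", 1000)
print(train.type.name, train.properties)
```

In this toy model, extending the ontology amounts to declaring a new ArtifactType or adding an entry to the properties of an existing one.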

TFX pipeline components

The following diagram shows the flow of data between different TFX components:

Flow of data between TFX components

All the images in the TFX section have been adapted from the TensorFlow Extended official guide: https://www.tensorflow.org/tfx/guide.

To begin with, we have ExampleGen, which ingests the input data and can also split the input dataset. The data then flows to StatisticsGen, which calculates the statistics of the dataset. Then comes SchemaGen, which examines the statistics and creates a data schema; then ExampleValidator, which looks for anomalies and missing values in the data; and then Transform, which performs feature engineering on the dataset. The transformed dataset is then fed to the Trainer, which trains the model. The performance of the model is evaluated using Evaluator and ModelValidator. Finally, if all is well, the Pusher deploys the model on the serving infrastructure.
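The first half of this flow, from ingestion to feature engineering, can be mimicked with a few plain Python functions. This is a conceptual sketch only: the function names echo the component names, but their signatures, the tiny dataset, and the notion of a schema as a value range per feature are our own simplifications and not how the TFX components are implemented. Training, evaluation, and pushing are left out:

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[float]]
Schema = Dict[str, Tuple[float, float]]


def example_gen(rows: List[Row], train_fraction: float = 0.75):
    """Ingest the data and split it into a training set and an evaluation set."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def statistics_gen(rows: List[Row]) -> Dict[str, Dict[str, float]]:
    """Compute per-feature statistics, ignoring missing values."""
    stats = {}
    for name in rows[0]:
        values = [r[name] for r in rows if r[name] is not None]
        stats[name] = {"min": min(values), "max": max(values), "mean": mean(values)}
    return stats


def schema_gen(stats: Dict[str, Dict[str, float]]) -> Schema:
    """Derive a simple schema: the observed value range of each feature."""
    return {name: (s["min"], s["max"]) for name, s in stats.items()}


def example_validator(rows: List[Row], schema: Schema) -> List[str]:
    """Report missing values and values outside the range given by the schema."""
    anomalies = []
    for i, row in enumerate(rows):
        for name, (low, high) in schema.items():
            value = row.get(name)
            if value is None:
                anomalies.append(f"row {i}: {name} is missing")
            elif not low <= value <= high:
                anomalies.append(f"row {i}: {name}={value} outside [{low}, {high}]")
    return anomalies


def transform(rows: List[Row], schema: Schema) -> List[Dict[str, float]]:
    """Feature engineering: min-max scale each feature (rows must be complete)."""
    scaled = []
    for row in rows:
        new_row = {}
        for name, (low, high) in schema.items():
            span = high - low
            new_row[name] = (row[name] - low) / span if span else 0.0
        scaled.append(new_row)
    return scaled


data = [
    {"age": 20.0, "income": 30.0},
    {"age": 40.0, "income": 50.0},
    {"age": 60.0, "income": 70.0},
    {"age": 90.0, "income": None},
]
train, evaluation = example_gen(data)
schema = schema_gen(statistics_gen(train))
anomalies = example_validator(evaluation, schema)
features = transform(train, schema)
print(schema)
print(anomalies)
print(features)
```

Here the schema is learned from the training split and then used both to validate the evaluation split and to drive the transformation, which mirrors the way each component consumes the output of the previous ones.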

TFX libraries

TFX provides several Python packages that are used to create pipeline components. Quoting from the TensorFlow Extended User Guide (https://www.tensorflow.org/tfx/guide):

"These packages are the libraries which you will use to create the components of your pipelines so that your code can focus on the unique aspects of your pipeline."

The libraries included in TFX are:

TensorFlow Data Validation (TFDV) is a library for analyzing and validating machine learning data
TensorFlow Transform (TFT) is a library for preprocessing data with TensorFlow
TensorFlow is used for training models with TFX
TensorFlow Model Analysis (TFMA) is a library for evaluating TensorFlow models
TensorFlow Metadata (TFMD) provides standard representations for metadata that are useful when training machine learning models with TensorFlow
ML Metadata (MLMD) is a library for recording and retrieving metadata associated with ML developer and data scientist workflows

The following diagram demonstrates the relationship between TFX libraries and pipeline components:

Figure 7: Relationships between TFX libraries and pipeline components, visualized

TFX uses the open source Apache Beam to implement data-parallel pipelines. Optionally, TFX can work with Apache Airflow and Kubeflow for easy configuration, operation, monitoring, and maintenance of the ML pipeline. Once the model is developed and trained, you can use TFX to deploy it to one or more deployment targets, where it will receive inference requests. TFX supports three classes of deployment targets: TensorFlow Serving (which works with a REST or gRPC interface), TensorFlow.js (for browser applications), and TensorFlow Lite (for native mobile and IoT applications). Trained models that have been exported as SavedModels can be deployed to any or all of these targets.
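As an illustration of the REST interface of TensorFlow Serving, the following sketch builds the URL and JSON body of a prediction request. The helper build_predict_request, the host localhost, the model name my_model, and the shape of the input are assumptions made for this example; port 8501 is the REST port used in the TensorFlow Serving documentation. The code only constructs the request and does not send it, since that would require a running server:

```python
import json
from typing import List, Tuple


def build_predict_request(host: str, model_name: str,
                          instances: List[List[float]],
                          port: int = 8501) -> Tuple[str, str]:
    """Return the URL and JSON body for a TensorFlow Serving REST predict call."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances})
    return url, body


# Hypothetical model name and a single input example with three features.
url, body = build_predict_request("localhost", "my_model", [[1.0, 2.0, 5.0]])
print(url)   # http://localhost:8501/v1/models/my_model:predict
print(body)  # {"instances": [[1.0, 2.0, 5.0]]}
```

With a server running, the body would be sent as an HTTP POST to the URL, for example with urllib.request from the standard library.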