Using TensorFlow Serving
In this section, we will show you how to serve machine learning models in production using the TensorFlow Serving component of the TensorFlow Extended (TFX) platform. TFX is an MLOps platform for building complete, end-to-end machine learning pipelines that are scalable and high-performance. A TFX pipeline is composed of a sequence of components for data validation, data transformation, model analysis, and model serving. In this recipe, we will focus on the last component, which supports features such as model versioning and serving multiple models at once.
Getting ready
Before starting, we encourage you to read through the official documentation and the short tutorials on the TFX site, available at https://www.tensorflow.org/tfx.
For this example, we will build an MNIST model, save it, download the TensorFlow Serving Docker image, run it, and send POST requests to the REST server in order to get some image predictions.
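To make the last step of this plan concrete, here is a minimal sketch of how a predict request to the TensorFlow Serving REST API can be assembled with the Python standard library. The model name `mnist`, the host `localhost`, and the helper names `build_predict_request` and `predict` are assumptions made for illustration only; port 8501 is the default REST port of TensorFlow Serving, and `serving_default` is the default signature name of an exported Keras model. The input shape (a 28x28 image here) must match whatever the saved model actually expects.

```python
import json
import urllib.request


def build_predict_request(instances, model_name="mnist", host="localhost",
                          port=8501, version=None):
    """Return (url, body) for a TensorFlow Serving REST predict call.

    model_name, host, and port are illustrative defaults (assumptions);
    adjust them to match how the serving container was started.
    """
    path = f"/v1/models/{model_name}"
    if version is not None:
        # Target a specific model version instead of the latest one.
        path += f"/versions/{version}"
    url = f"http://{host}:{port}{path}:predict"
    # The "instances" key holds one entry per example in the batch.
    body = json.dumps({"signature_name": "serving_default",
                       "instances": instances}).encode("utf-8")
    return url, body


def predict(instances, **kwargs):
    """Send the POST request and return the list under "predictions".

    This requires a running TensorFlow Serving instance.
    """
    url, body = build_predict_request(instances, **kwargs)
    request = urllib.request.Request(
        url, data=body,
        headers={"Content-Type": "application/json"},
        method="POST")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["predictions"]


# Build (but do not send) a request for a single blank 28x28 image.
image = [[0.0] * 28 for _ in range(28)]
url, body = build_predict_request([image])
print(url)
```

Once the serving container is running, calling `predict([image])` would send the request and return one prediction per submitted image. Passing `version=2`, for example, targets that specific model version rather than the latest one.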
...