This chapter presents the ultimate goal of any real-life machine learning application: the deployment of, and inference with, a trained model. As we saw in the previous chapters, TensorFlow allows us to train models and save their parameters in checkpoint files, making it possible to restore the model's state and continue the training process, as well as to run inference from Python.
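The save-and-restore cycle described above can be sketched as follows. This is a minimal example assuming TensorFlow 2.x; the single `tf.Variable` stands in for a full model's parameters, and the checkpoint path is arbitrary:

```python
import tempfile
import tensorflow as tf

# A single variable standing in for a trained model's parameters.
v = tf.Variable(3.0)
ckpt = tf.train.Checkpoint(v=v)

# Saving writes only the parameter values (plus bookkeeping metadata),
# not the computation that produced them.
prefix = tempfile.mkdtemp() + "/ckpt"
save_path = ckpt.save(prefix)

# Restoring requires rebuilding the same object structure first,
# then loading the stored values into it.
v_restored = tf.Variable(0.0)
ckpt_restored = tf.train.Checkpoint(v=v_restored)
ckpt_restored.restore(save_path)
print(v_restored.numpy())  # 3.0
```

Note that the restore step only succeeds because we recreated a matching `tf.train.Checkpoint` structure: the checkpoint files alone carry no description of the model.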
The checkpoint files, however, are not the right file format when the goal is to use a trained machine learning model with low latency and a low memory footprint. In fact, the checkpoint files contain only the model's parameter values, without any description of the computation; this forces the program to define the model structure first and only then restore the parameters. Moreover, the checkpoint files contain variable values that...