New data can be passed to a trained model to obtain results. This process of obtaining classification results or features from an image is termed inference. Training and inference usually happen on different computers and at different times. We will learn about storing a model, running inference on it, and using TensorFlow Serving as a server with good latency and throughput.
Model inference
Exporting a model
After training, the model has to be exported and saved; its weights, biases, and graph are stored for inference. We will train an MNIST model and store it. Start by defining the required constants, using the following code:
work_dir = '/tmp'
model_version = 9
training_iteration = 1000
input_size = 784  # MNIST images are 28 x 28 pixels, flattened to 784 values
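To see where these constants fit, the following is a minimal TensorFlow 1.x-style sketch of building a graph and exporting it with SavedModelBuilder. The simple softmax-regression graph and the 'x'/'y' signature names are illustrative assumptions, not necessarily the exact model trained in this section:

import tensorflow as tf

# A simple softmax-regression graph standing in for the trained MNIST model
x = tf.placeholder(tf.float32, shape=[None, input_size], name='x')
weights = tf.Variable(tf.zeros([input_size, 10]), name='weights')
bias = tf.Variable(tf.zeros([10]), name='bias')
logits = tf.matmul(x, weights) + bias

# The export path combines the base directory and the model version,
# which is the layout TensorFlow Serving expects
export_path = work_dir + '/' + str(model_version)
builder = tf.saved_model.builder.SavedModelBuilder(export_path)

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    # Training steps would run here for training_iteration iterations

    # Describe the graph's input and output tensors as a serving signature
    signature = tf.saved_model.signature_def_utils.build_signature_def(
        inputs={'x': tf.saved_model.utils.build_tensor_info(x)},
        outputs={'y': tf.saved_model.utils.build_tensor_info(logits)},
        method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)

    # Store the graph and variables under the SERVING tag and write to disk
    builder.add_meta_graph_and_variables(
        session,
        [tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
                signature
        })
    builder.save()

The signature definition records which tensors to feed and fetch at serving time, so TensorFlow Serving can run the graph without any knowledge of the training code.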