Model inference
Any new data can be passed to the model to get results. This process of obtaining classification results or features from an image is termed inference. Training and inference usually happen on different computers and at different times. We will learn about storing the model, running inference, and using TensorFlow Serving as a server with good latency and throughput.
Exporting a model
After training, the model has to be exported and saved. The weights, biases, and the graph are stored for inference. We will train an MNIST model and store it. Start by defining the required constants, using the following code:
work_dir = '/tmp'
model_version = 9
training_iteration = 1000
input_size = 784
no_classes = 10
batch_size = 100
total_batches = 200
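As a minimal sketch of how these constants come into play, assuming the TensorFlow 1.x tutorial helper for MNIST (the mnist_data directory name is illustrative), the data can be loaded and batched as follows:

from tensorflow.examples.tutorials.mnist import input_data

# Download and load MNIST into work_dir; labels come back one-hot encoded
mnist_data = input_data.read_data_sets(work_dir + '/mnist_data', one_hot=True)

# Fetch one training batch of images and their one-hot labels
train_images, train_labels = mnist_data.train.next_batch(batch_size)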
The model_version can be an integer specifying which version of the model we want to export for serving. The feature config is stored as a dictionary mapping placeholder names to their corresponding datatypes. The prediction classes and...
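A hedged sketch of such a feature config, assuming the TensorFlow 1.x Serving-style export (the placeholder name tf_example and the feature key x are illustrative):

import tensorflow as tf

# Serialized tf.Example protos arrive through a string placeholder
serialized_tf_example = tf.placeholder(tf.string, name='tf_example')

# Map each feature key to its expected shape and datatype
feature_configs = {'x': tf.FixedLenFeature(shape=[input_size], dtype=tf.float32)}
tf_example = tf.parse_example(serialized_tf_example, feature_configs)

# The parsed feature becomes the model's input tensor
x_input = tf.identity(tf_example['x'], name='x')

Parsing incoming requests through tf.parse_example this way lets the served model accept serialized protos rather than raw tensors, which is the form TensorFlow Serving clients typically send.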