Quantizing the model with the TFLite converter
Exporting the trained network as a SavedModel saves the complete training graph, including the network architecture, weights, training variables, and checkpoints. Therefore, the generated TF model is perfect for sharing or resuming a training session, but it is not suitable for microcontroller deployment for the following reasons:
- The weights are stored in floating-point format.
- The model keeps information that's not required for inference.
Since our target device has computational and memory constraints, it is crucial to transform the trained model into something compact.
This recipe will teach you how to quantize the trained model and convert it into a lightweight, memory-efficient, and easy-to-parse export format with TensorFlow Lite (TFLite). The generated model will then be converted to a C byte array, suitable for microcontroller deployment.
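As a minimal sketch of what this conversion step can look like (assuming the network was exported to a hypothetical saved_model_dir folder and that the input shape shown in the calibration generator matches your model), full integer quantization with the TFLite converter might be performed as follows:

```python
import numpy as np
import tensorflow as tf

# Hypothetical path to the SavedModel exported in the previous recipe
saved_model_dir = "saved_model_dir"

# Hypothetical generator yielding a few representative input samples.
# TFLite uses these samples to calibrate the quantization ranges.
# Replace the shape (1, 28, 28, 1) with your model's actual input shape.
def representative_dataset():
    for _ in range(100):
        sample = np.random.rand(1, 28, 28, 1).astype(np.float32)
        yield [sample]

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full 8-bit integer quantization of weights and activations
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()

# Save the quantized model to disk
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting model.tflite file can then be turned into a C byte array, for example with the xxd utility (`xxd -i model.tflite > model.h`), so it can be compiled directly into the microcontroller firmware.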
The following Colab file (see the Quantizing the model with TFLite converter section...