Summary
In this chapter, we learned to optimize a trained model by making it smaller and more compact, which gives us more flexibility when deploying models under various hardware or resource constraints. Optimization matters most when deploying to resource-constrained environments such as edge devices with limited compute, memory, or power. We achieved model optimization by means of quantization, reducing the model's footprint by changing the data types used to store its weights, biases, and activations.
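To make the footprint reduction concrete, here is an illustrative sketch (not the TensorFlow Lite implementation itself) of affine int8 quantization: each float32 value is mapped to an 8-bit integer via a scale and zero point, shrinking per-value storage from 4 bytes to 1 at the cost of a small rounding error.

```python
import numpy as np

# Illustrative only: simulate quantizing a float32 weight tensor to int8.
weights = np.random.randn(256).astype(np.float32)

# Derive scale and zero point from the observed value range,
# mapping [w_min, w_max] onto the 256 levels of int8 [-128, 127].
w_min, w_max = float(weights.min()), float(weights.max())
scale = (w_max - w_min) / 255.0
zero_point = np.round(-w_min / scale) - 128

# Quantize to int8, then dequantize to measure the reconstruction error.
q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)
deq = (q.astype(np.float32) - zero_point) * scale

print(f"float32 size:  {weights.nbytes} bytes")  # 4 bytes per value
print(f"int8 size:     {q.nbytes} bytes")        # 1 byte per value (4x smaller)
print(f"max abs error: {np.abs(weights - deq).max():.4f}")
```

The rounding error is bounded by half the scale, which is why quantization usually costs only a small amount of accuracy while cutting storage by 4x.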
We learned about three quantization strategies: reduced float16 quantization, hybrid quantization, and integer quantization. Of these three strategies, integer quantization currently requires an upgrade to TensorFlow 2.3.
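The three strategies can be sketched with the TFLiteConverter API as follows. This is a hedged illustration: the tiny Keras model and the representative-data generator are stand-ins for your own trained model and calibration data.

```python
import numpy as np
import tensorflow as tf

# Stand-in for a trained model; substitute your own.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])

# 1. Reduced float16 quantization: weights stored as float16.
conv = tf.lite.TFLiteConverter.from_keras_model(model)
conv.optimizations = [tf.lite.Optimize.DEFAULT]
conv.target_spec.supported_types = [tf.float16]
fp16_model = conv.convert()

# 2. Hybrid (dynamic range) quantization: int8 weights,
#    activations computed in float.
conv = tf.lite.TFLiteConverter.from_keras_model(model)
conv.optimizations = [tf.lite.Optimize.DEFAULT]
hybrid_model = conv.convert()

# 3. Full integer quantization: a representative dataset lets the
#    converter calibrate activation ranges; integer-only input/output
#    requires TensorFlow 2.3 or later.
def representative_data():
    for _ in range(10):
        yield [np.random.randn(1, 8).astype(np.float32)]

conv = tf.lite.TFLiteConverter.from_keras_model(model)
conv.optimizations = [tf.lite.Optimize.DEFAULT]
conv.representative_dataset = representative_data
conv.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
conv.inference_input_type = tf.int8
conv.inference_output_type = tf.int8
int8_model = conv.convert()
```

Each `convert()` call returns a serialized TFLite flatbuffer that you would write to a `.tflite` file for deployment.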
Choosing a quantization strategy depends on factors such as the target hardware's compute and memory resources, model size limits, and acceptable accuracy loss. Furthermore, you have to consider whether or not the target hardware requires integer-only ops...