Summary
In this chapter, you learned the basics of serving Transformer models with FastAPI, and then how to serve them in a more advanced and efficient way using TFX. You studied the basics of load testing: creating simulated users, spawning them in groups or one by one, and reporting the results of stress testing. You also learned the basics of Docker and how to package your application as a Docker container. Together, these skills cover how to serve Transformer-based models.
In the next chapter, you will learn about Transformer deconstruction, the model view, and monitoring training using various tools and techniques.