Chapter 10: Advanced ML Engineering
Congratulations on making it this far. By now, you should have a good understanding of the core skills that a machine learning (ML) solutions architect needs to work effectively across the different phases of the ML life cycle. In this chapter, we will dive deep into several advanced ML topics. We will cover distributed model training options for large models and large datasets, and discuss technical approaches for reducing model inference latency. We will close the chapter with a hands-on lab on distributed model training.
This chapter covers the following topics:
- Training large-scale models with distributed training
- Achieving low-latency model inference
- Hands-on lab – running distributed model training with PyTorch