Summary
In this chapter, we discussed adaptive model training and elastic model serving. At a high level, both techniques let us adjust the number of workers or GPUs in use in the middle of a model training or serving session.
After reading this chapter, you should understand how adaptive DNN training works in both data parallelism and model parallelism. You should also be able to implement adaptive model training using the adaptdl library. In addition, you should know how elastic model serving works and how to use an AWS serverless computing environment to request computation resources.
In the next chapter, we will discuss more advanced techniques for further DNN training and serving speed-ups.