Optimizing your script for SageMaker training
So far in this book, you have learned quite a lot! We have covered everything from the foundations of pretraining to GPU optimization, picking the right use case, dataset and model preparation, parallelization basics, finding the right hyperparameters, and so on. The vast majority of this is applicable in any compute environment you choose. This chapter, however, is scoped exclusively to AWS, and to SageMaker in particular. Why? So that you can master the nuances of at least one compute platform. Once you are proficient in one compute platform, you can use it to work on any project you like! When, for whatever reason, you need to move to another platform, you will at least know the basic concepts to look for as you plan the transition.
First, let us look at your scripts. The core of most SageMaker training scripts has at least...