Finding the Right Hyperparameters
In this chapter, you’ll dive into the key hyperparameters that govern performance for top vision and language models, such as batch size and learning rate. First, we’ll start with a quick overview of hyperparameter tuning for those who are new to it or need a light refresher, including key examples in vision and language. Then, we’ll explore hyperparameter tuning for foundation models, covering both what is possible today and where trends might emerge. Finally, we’ll learn how to do this on Amazon SageMaker, taking incremental steps in cluster size and changing each hyperparameter as we do; a brief sketch of that scaling idea follows the topic list.

In this chapter, we’re going to cover the following main topics:
- Hyperparameters – batch size, learning rate, and more
- Tuning strategies
- Tuning for foundation models
- Scaling up as a function of world size with SageMaker
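As a small preview of the last topic, the sketch below shows one common heuristic for adjusting hyperparameters as the cluster ("world") grows: keeping the per-device batch size fixed and scaling the learning rate linearly with world size. This is a minimal, illustrative example, not the chapter's full procedure; the function name and base values are hypothetical placeholders.

```python
# Minimal preview sketch (hypothetical helper, not from this chapter's code):
# linearly scale the learning rate with world size while holding the
# per-device batch size constant.

def scale_hyperparameters(base_lr: float, base_batch_size: int, world_size: int) -> dict:
    """Scale per-device settings to the full cluster (world) size."""
    # The global batch size grows with the number of devices.
    global_batch_size = base_batch_size * world_size
    # Linear scaling rule: grow the learning rate in proportion to the
    # global batch size (i.e., to the world size).
    scaled_lr = base_lr * world_size
    return {"learning_rate": scaled_lr, "global_batch_size": global_batch_size}

# Example: 8 devices, per-device batch size of 32, base learning rate of 1e-4
print(scale_hyperparameters(base_lr=1e-4, base_batch_size=32, world_size=8))
# {'learning_rate': 0.0008, 'global_batch_size': 256}
```

Later in the chapter, we'll see how choices like these translate into the hyperparameters you pass to a SageMaker training job as you step up cluster sizes.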