Chapter 6: Running Hyperparameter Tuning at Scale
Hyperparameter tuning, or hyperparameter optimization (HPO), is a procedure that finds the best possible deep neural network structure, choice of pretrained model, and model training configuration within a reasonable computing resource constraint and time frame. Here, a hyperparameter is a parameter that cannot be changed or learned during the ML training process, such as the number of layers inside a deep neural network, the choice of a pretrained language model, or the learning rate, batch size, and optimizer of the training process. In this chapter, we will use HPO as shorthand for the process of hyperparameter tuning and optimization. HPO is a critical step for producing a high-performance ML/DL model. Given that the hyperparameter search space can be very large, efficiently running HPO at scale is a major challenge. The complexity and high cost of evaluating a DL model, compared to a classical ML model, further compound this challenge.
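To make the distinction between hyperparameters and learned parameters concrete, the following minimal sketch shows what an HPO search space and trial loop look like in practice. It uses the Optuna library purely as one illustrative HPO framework, and the objective function here is a hypothetical stand-in for a real training-and-validation run:

```python
# A minimal HPO sketch using Optuna as one illustrative framework.
# The toy objective below is a hypothetical stand-in for training a
# model and returning its validation loss.
import optuna

def objective(trial):
    # Hyperparameters are fixed before training starts; they are sampled
    # by the HPO framework, not learned from the data.
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True)
    batch_size = trial.suggest_categorical("batch_size", [16, 32, 64, 128])
    num_layers = trial.suggest_int("num_layers", 1, 4)
    optimizer = trial.suggest_categorical("optimizer", ["adam", "sgd"])

    # In a real run, train a model with this configuration and return a
    # validation metric; this toy expression merely mimics such a cost.
    penalty = 0.01 if optimizer == "sgd" else 0.0
    return (lr - 1e-3) ** 2 + abs(num_layers - 2) * 0.1 + batch_size * 1e-5 + penalty

study = optuna.create_study(direction="minimize")  # minimize validation loss
study.optimize(objective, n_trials=20)             # run 20 trials
print(study.best_params)                           # best configuration found
```

Each call to the objective function is one trial, that is, one full model training and evaluation. Running many such trials, ideally in parallel across machines, is precisely what running HPO at scale means in this chapter.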