Model Preparation
In this chapter, you’ll learn how to choose the model that will serve as the basis for your pretraining regime. You’ll learn how to think about model size, measured in parameters, along with the key loss functions and how they determine performance in production. Finally, you’ll combine the scaling laws with the expected size of your dataset to select the ceiling and floor model sizes that will guide your experiments.
We will cover the following topics:
- Finding your best base model
- Finding your pretraining loss function
- Solving for your model size
- Planning future experiments