Regularization – a multi-dimensional problem
Having the right diagnosis for a model is crucial, because it lets us choose an improvement strategy more carefully. From any given diagnosis, however, there are many possible paths to improve the model. Those paths can be separated into three main categories, as proposed in the following figure:
Figure 1.17 – A proposed categorization of regularization types: data, model architecture, and model training
At the data level, we may have the following tools for regularization:
- Adding more data, either synthetic or real
- Adding more features
- Feature engineering
- Data preprocessing
Data is of paramount importance in ML in general, and regularization is no exception. We will see many examples of regularizing through the data throughout the book.
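To make the first tool in the list concrete, here is a minimal sketch of how adding more data can act as a regularizer. It fits the same flexible model (a degree-9 polynomial) on a small and a large training set and compares the gap between train and test error. The sine-shaped signal, the noise level, the polynomial degree, and the helper name `fit_and_score` are all illustrative assumptions, not something taken from the book.

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_signal(x):
    # Hypothetical ground-truth function used only for this illustration.
    return np.sin(2 * np.pi * x)


def fit_and_score(n_train, degree=9, noise=0.3, seed=0):
    """Fit a polynomial on n_train noisy samples; return (train_mse, test_mse)."""
    rng = np.random.default_rng(seed)

    # Noisy training samples drawn uniformly from [0, 1].
    x_train = rng.uniform(0.0, 1.0, n_train)
    y_train = true_signal(x_train) + rng.normal(0.0, noise, n_train)

    # Least-squares polynomial fit; the model capacity stays fixed.
    model = Polynomial.fit(x_train, y_train, degree)

    # Held-out noisy samples on a regular grid over the same interval.
    x_test = np.linspace(0.0, 1.0, 200)
    y_test = true_signal(x_test) + rng.normal(0.0, noise, x_test.size)

    train_mse = float(np.mean((model(x_train) - y_train) ** 2))
    test_mse = float(np.mean((model(x_test) - y_test) ** 2))
    return train_mse, test_mse


if __name__ == "__main__":
    for n in (15, 50, 500):
        train_mse, test_mse = fit_and_score(n)
        print(f"n_train={n:4d}  train MSE={train_mse:.4f}  test MSE={test_mse:.4f}")
```

With only a handful of samples, a model of this capacity tends to fit the noise, so the train error is low while the test error is much higher. As the training set grows, the two errors should move closer together even though the model itself has not changed.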
At the model level, the following methods may be used for regularization:
- Choosing a simpler or a more complex architecture
- In deep...