Options for improving model performance
The changes we can make to improve model performance relate either to the algorithms we use or to the data we train them on (see Table 5.1). Adding more data points could reduce the variance of the model; for example, adding data close to the decision boundaries of a classification model can increase confidence in the identified boundaries and reduce overfitting. Removing outliers could reduce both bias and variance by eliminating the effect of distant data points. Adding more features could help the model fit the training data better (that is, lower model bias), but it might result in higher variance. Conversely, some features cause overfitting, and removing them could help to increase model generalizability.
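The claim that more training data can reduce model variance can be illustrated with a small simulation. The following is a minimal sketch rather than a definitive experiment: it assumes a synthetic linear relationship with Gaussian noise, and the helper names (`fit_line`, `prediction_variance`), the query point, and the noise level are hypothetical choices made for illustration. It refits a simple least-squares line on many independently drawn training sets and measures how much the prediction at one fixed point varies from fit to fit.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a * x + b; returns (a, b)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def prediction_variance(n_points, n_repeats=200, x_query=0.5, seed=0):
    """Variance of the prediction at x_query across refits on fresh samples.

    Assumed data-generating process (illustrative only):
    y = 2x + 1 + Gaussian noise with standard deviation 0.5.
    """
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_repeats):
        xs = [rng.random() for _ in range(n_points)]
        ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.5) for x in xs]
        slope, intercept = fit_line(xs, ys)
        predictions.append(slope * x_query + intercept)
    return statistics.pvariance(predictions)


# Compare a small and a large training set under the same assumptions.
small = prediction_variance(n_points=10)
large = prediction_variance(n_points=200)
print(f"prediction variance with 10 points:  {small:.5f}")
print(f"prediction variance with 200 points: {large:.5f}")
```

Under these assumptions, the variance measured with the larger training set should come out noticeably smaller, because each individual fit depends less on the noise in any single sample. The same pattern is not guaranteed for every model or dataset, so it is worth checking on your own data.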
Change | Potential effect | Description |
Adding more training data... |