Introducing ggplot2
Conveying information via graphs tends to be more effective and visually appealing than tables alone. After all, humans are much quicker at processing visual information, such as recognizing a car in an image. In building machine learning (ML) models, we are often interested in the training and test loss profile in the form of a line chart that indicates the reduction in the training and test set loss as the model gets trained for a more extended period. Observing performance metrics helps us better diagnose whether a model is underfitting or overfitting—in other words, whether the current model is too simple or overly complex. Note that the test set is used to approximate a future dataset, and minimizing the test set error helps the model generalize to new datasets, an approach known as empirical risk minimization. Underfitting refers to the case when the model does poorly in both training and test sets due to insufficient fitting power, while overfitting...