Exploring the different model evaluation methods
Most practitioners are familiar with accuracy-related metrics, which form the most basic evaluation method. For supervised problems, a practitioner will typically treat an accuracy-related metric as the single source of truth. In the context of model evaluation, the term “accuracy metrics” is often used loosely to cover a range of performance metrics, such as accuracy, F1 score, recall, and precision for classification, and mean squared error for regression. When coupled with a suitable cross-validation partitioning strategy, metrics as a standalone evaluation strategy can go a long way in most projects.

In deep learning, accuracy-related metrics are typically used to monitor the model's progress at each epoch. This monitoring can then be extended to perform early stopping, halting training once the model stops improving, and to decide when to reduce the learning rate. Additionally, the best model weights can be...
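As a concrete illustration of metrics coupled with cross-validation, the following sketch uses scikit-learn on a synthetic classification dataset. The dataset, model choice, and split parameters are assumptions for demonstration only, not part of any specific project:

```python
# Hedged sketch: metric-based evaluation on a synthetic dataset,
# first with a single hold-out split, then with cross-validation.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data stands in for a real supervised problem.
X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = model.predict(X_te)

# The "accuracy metrics" family: each summarizes performance differently.
print("accuracy :", accuracy_score(y_te, pred))
print("precision:", precision_score(y_te, pred))
print("recall   :", recall_score(y_te, pred))
print("f1       :", f1_score(y_te, pred))

# A cross-validation partitioning strategy gives a more robust
# estimate than any single train/test split.
cv_scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y, cv=5, scoring="f1"
)
print("5-fold F1:", cv_scores.mean())
```

Reporting the cross-validated mean rather than a single split's score guards against an unluckily easy (or hard) test partition dominating the evaluation.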
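The monitoring loop described above, tracking a metric each epoch, stopping when it stalls, and keeping the best weights, can be sketched framework-agnostically. The `train_step` and `evaluate` callables and the patience value here are hypothetical placeholders, not the API of any particular deep learning library:

```python
# Hedged sketch: epoch-level monitoring with early stopping and
# best-weights tracking, independent of any specific framework.
def train_with_early_stopping(train_step, evaluate, max_epochs=100, patience=5):
    """Run training epochs; stop after `patience` epochs without improvement.

    Returns the state (e.g. model weights) that achieved the best score,
    not the state from the final epoch.
    """
    best_score, best_state, waited = float("-inf"), None, 0
    for epoch in range(max_epochs):
        state = train_step(epoch)     # one epoch of training
        score = evaluate(state)       # monitored validation metric
        if score > best_score:
            best_score, best_state, waited = score, state, 0
        else:
            waited += 1
            if waited >= patience:    # no improvement for `patience` epochs
                break
    return best_state, best_score

# Toy usage: a validation-score curve that peaks early, then declines.
curve = [0.5, 0.6, 0.7, 0.69, 0.68, 0.67, 0.66, 0.65, 0.9]
best_state, best_score = train_with_early_stopping(
    train_step=lambda epoch: epoch,        # "state" is just the epoch index
    evaluate=lambda state: curve[state],
    max_epochs=len(curve),
    patience=5,
)
# Stops before reaching the late 0.9 and keeps the epoch-2 weights:
print(best_state, best_score)  # → 2 0.7
```

Note that the loop returns the best-scoring state rather than the last one, which is the "best model weights" behavior mentioned above; real frameworks typically implement this via checkpointing callbacks.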