Evaluating classification algorithm performance with metrics
There are several metrics available for comparing classification ML algorithms, each with its own strengths and weaknesses. We'll look at some of the more common metrics for classification here.
Train-validation-test splits
When evaluating the performance of an algorithm, it's important to look at performance on data that was not used for training. The model has already learned the ins and outs of the training data, and may even have overfit to it, learning patterns related to noise in the data. Instead, we want to evaluate performance on a hold-out set, which we may call a validation or test set. For some algorithms, such as neural networks, we train the model on a training set, monitor performance during training on a validation set, and then evaluate final performance on a test set. For most other ML algorithms, we simply use a training and a test set. If our test or validation sets contain...
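As a minimal sketch of the three-way split described above (using only the standard library; in practice a function such as scikit-learn's train_test_split is commonly used instead, and the 70/15/15 proportions here are just an illustrative choice):

```python
import random

def train_val_test_split(data, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle the data and partition it into train/validation/test subsets.

    Shuffling before splitting helps ensure each subset is a representative
    sample; fixing the seed makes the split reproducible.
    """
    items = list(data)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

# Split 100 samples into 70 train / 15 validation / 15 test examples
train, val, test = train_val_test_split(range(100))
```

For algorithms that don't need a validation set during training, setting val_frac to 0 yields a simple train/test split.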