Evaluating the accuracy using cross-validation
The cross-validation is an important concept in machine learning. In the previous recipe, we split the data into training and testing datasets. However, in order to make it more robust, we need to repeat this process with different subsets. If we just fine-tune it for a particular subset, we may end up overfitting the model. Overfitting refers to a situation where we fine-tune a model too much to a dataset and it fails to perform well on unknown data. We want our machine learning model to perform well on unknown data.
Getting ready…
Before we discuss how to perform cross-validation, let's talk about performance metrics. When we are dealing with machine learning models, we usually care about three things: precision, recall, and F1 score. We can get the required performance metric using the parameter scoring. Precision refers to the number of items that are correctly classified as a percentage of the overall number of items in the list. Recall...