Evaluating potential models using MSE and R2 scores
There are always many candidate models you could train, and you can spend a lot of time tuning each one. It's valuable to find out which ones are likely to give the best outcome before you invest that time in any single option. We're going to use k-fold cross-validation to estimate how well each model performs on data it has not seen. This splits our training data into k sections, or folds. You can think of it as dividing a sheet of paper into k equal parts, and then taking turns using one part as the testing data and the remaining k - 1 parts as the training data, so that every part is used for testing exactly once:
- First, we import what we need for this exercise. The code that follows trains each candidate so we can see which model is a good fit:
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import ...
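To make the mechanics of k-fold evaluation concrete, here is a minimal sketch that uses only the Python standard library. It is not the scikit-learn implementation and not the code used in this exercise. The helper names (k_fold_indices, fit_line, mse, r2, cross_validate), the toy data, and the single-feature least-squares model are all assumptions made for illustration.

```python
# Illustrative sketch only: every helper name below is hypothetical.

def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mse(y_true, y_pred):
    """Mean squared error: lower is better, 0 is a perfect fit."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 is perfect, 0 matches predicting the mean."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def cross_validate(xs, ys, k):
    """Return one (mse, r2) pair per fold, each fold serving once as test data."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        y_test = [ys[i] for i in test_idx]
        y_pred = [slope * xs[i] + intercept for i in test_idx]
        scores.append((mse(y_test, y_pred), r2(y_test, y_pred)))
    return scores

# Toy data (assumed): a straight line with a small alternating offset.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]

fold_scores = cross_validate(xs, ys, k=5)
mean_mse = sum(m for m, _ in fold_scores) / len(fold_scores)
mean_r2 = sum(r for _, r in fold_scores) / len(fold_scores)
print(f"mean MSE: {mean_mse:.3f}, mean R2: {mean_r2:.3f}")
```

Averaging the per-fold scores gives one number per metric for each candidate, which is what makes candidates comparable. In scikit-learn, cross_val_score plays a similar role when called with scoring="neg_mean_squared_error" or scoring="r2". Note that StratifiedKFold stratifies on class labels, so for a continuous regression target plain KFold is the more usual splitter.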