Finding optimal hyperparameters with GridSearchCV
As we have created new models and tried various data processing techniques, we have used many different parameters and function arguments to determine how we set up the problem. One example is the impute method. Should we use the mean, the median, or some more advanced approach, and how do we know which to take? One naïve approach is to write a for loop that tries every technique, calculates the score for each, and keeps the best one. We tried a similar approach in the previous section when comparing which algorithm gave us the best score.
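As a minimal sketch of that loop, the example below tries three imputation strategies and keeps the best-scoring one. The synthetic dataset, the logistic regression model, and the single train/test split are assumptions chosen for illustration; they are not the data or model used elsewhere in this chapter.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Assumption: synthetic data with roughly 10% of values knocked out,
# standing in for a real dataset with missing entries.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.1] = np.nan

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Try every imputation strategy and record the test accuracy of each.
scores = {}
for strategy in ["mean", "median", "most_frequent"]:
    model = make_pipeline(
        SimpleImputer(strategy=strategy),
        LogisticRegression(max_iter=1000),
    )
    model.fit(X_train, y_train)
    scores[strategy] = model.score(X_test, y_test)

# Keep the strategy with the highest score.
best_strategy = max(scores, key=scores.get)
print(scores)
print("best:", best_strategy)
```

Because each strategy is judged on a single split, the winner can change with a different random seed, which is one reason to prefer a cross-validated search.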
This might be naïve, but never overlook the simple. It is such a good approach that scikit-learn packages it into a single easy-to-use class, GridSearchCV. It even performs k-fold cross-validation on every candidate, so the chosen settings are less likely to be an artifact of one lucky split. There are a few different ways to tune hyperparameters, but we're going to focus on a grid search.
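The same comparison can be sketched with GridSearchCV. The pipeline step names (impute, clf), the parameter values, and the synthetic data below are assumptions made for illustration; only GridSearchCV, Pipeline, and SimpleImputer themselves come from scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Assumption: synthetic data with roughly 10% of values missing.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.1] = np.nan

# Step names are arbitrary labels; they prefix the parameter names below.
pipe = Pipeline([
    ("impute", SimpleImputer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Every combination is tried: 3 strategies x 3 values of C = 9 candidates.
param_grid = {
    "impute__strategy": ["mean", "median", "most_frequent"],
    "clf__C": [0.1, 1.0, 10.0],
}

# cv=5 scores each candidate with 5-fold cross-validation.
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

After fitting, search.best_estimator_ holds the pipeline refit on all of the data with the winning settings, so it can be used directly for prediction.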
A grid...