Finding the optimal hyperparameters through grid search
Finding the best hyperparameters (so called because they influence the parameters learned during the training phase) is not always easy, and there are seldom good starting points. Personal experience (a fundamental element) must be aided by an efficient tool such as GridSearchCV, which automates the training of different models and provides the user with the optimal values through cross-validation.
As an example, we show how to use it to find the best penalty and strength factors for a logistic regression with the Iris toy dataset:
import multiprocessing

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

>>> iris = load_iris()

>>> param_grid = [
   {
      'penalty': [ 'l1', 'l2' ],
      'C': [ 0.5, 1.0, 1.5, 1.8, 2.0, 2.5 ]
   }
]

>>> gs = GridSearchCV(estimator=LogisticRegression(),
                      param_grid=param_grid,
                      scoring='accuracy',
                      cv=10,
                      n_jobs=multiprocessing.cpu_count())
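Constructing the search object is not enough on its own: the grid search only runs when the object is fitted, after which the winning configuration can be read from its best_* attributes. The following is a self-contained, runnable sketch of that step (note one assumption: solver='liblinear' is passed explicitly, because recent scikit-learn releases reject penalty='l1' with the default solver):

```python
import multiprocessing

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

iris = load_iris()

# Same grid as above: two penalty types, six regularization strengths
param_grid = [
    {
        'penalty': ['l1', 'l2'],
        'C': [0.5, 1.0, 1.5, 1.8, 2.0, 2.5],
    }
]

# solver='liblinear' is an assumption added here so that both
# 'l1' and 'l2' penalties are accepted by modern scikit-learn
gs = GridSearchCV(estimator=LogisticRegression(solver='liblinear'),
                  param_grid=param_grid,
                  scoring='accuracy',
                  cv=10,
                  n_jobs=multiprocessing.cpu_count())

# Running fit trains one model per grid point per fold (12 x 10 here)
gs.fit(iris.data, iris.target)

print(gs.best_params_)  # the winning 'penalty' and 'C' combination
print(gs.best_score_)   # mean cross-validated accuracy of that model
```

After fitting, gs.best_estimator_ also exposes the refitted model itself, ready to be used for predictions on new data.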