Finding optimal training parameters using grid search
When you work with classifiers, you do not always know which parameter values are best, and checking every possible combination by hand is impractical. This is where grid search becomes useful. With grid search, we specify a set of candidate values for each parameter, and the search automatically trains and evaluates the classifier on every combination to find the one that performs best. Let's see how to do it.
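Before the full walkthrough, here is a minimal sketch of the idea. It is not this section's own code. It assumes a recent scikit-learn, where GridSearchCV lives in sklearn.model_selection rather than the older grid_search module. The synthetic data from make_classification and the candidate values in param_grid are chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic two-class data stands in for a real dataset (illustrative assumption)
X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                           random_state=0)

# Candidate values for each parameter; every combination (2 x 3 = 6) is tried
param_grid = {'n_estimators': [25, 50], 'max_depth': [2, 4, 8]}

# Evaluate each combination with 3-fold cross-validation, scored by accuracy
search = GridSearchCV(ExtraTreesClassifier(random_state=0), param_grid,
                      cv=3, scoring='accuracy')
search.fit(X, y)

best_params = search.best_params_   # the best-scoring combination
best_score = search.best_score_     # its mean cross-validated accuracy
print(best_params, round(best_score, 3))
```

The cost grows with the product of the list lengths in param_grid, so it usually pays to keep each list short and refine around the best values in a second pass.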
Create a new Python file and import the following packages:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report
# Note: cross_validation and grid_search were removed in scikit-learn 0.20;
# on newer versions, import the equivalents from sklearn.model_selection.
from sklearn import cross_validation, grid_search
from sklearn.ensemble import ExtraTreesClassifier

from utilities import visualize_classifier
We will use the data available in data_random_forests.txt for analysis:
# Load input data input_file...