Decision tree and random forest regression
In this section, we will use a decision tree and a random forest to build regression models with the same income gap data we worked with earlier in this chapter. We will also use tuning to identify the hyperparameters that give us the best-performing model, just as we did with KNN regression. Let’s get started:
- We must load many of the same libraries as we did with KNN regression, plus `DecisionTreeRegressor` and `RandomForestRegressor` from scikit-learn:

```python
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeRegressor, plot_tree
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import SelectFromModel
```
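To see how these imports fit together, here is a minimal sketch of the tuning workflow they support: a pipeline that imputes missing values and fits a random forest, tuned with `RandomizedSearchCV`. This is not the chapter's actual code; the synthetic data stands in for the income gap dataset, and the hyperparameter grid is purely illustrative.

```python
# A minimal sketch of pipeline + randomized hyperparameter search.
# Synthetic data is used here in place of the income gap dataset.
import numpy as np
from sklearn.model_selection import train_test_split, RandomizedSearchCV
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] * 2 + rng.normal(size=200)  # target driven by the first feature

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Pipeline steps are auto-named (lowercased class names), so the
# parameter grid keys use the "randomforestregressor__" prefix.
pipe = make_pipeline(SimpleImputer(), RandomForestRegressor(random_state=0))
params = {
    "randomforestregressor__n_estimators": [50, 100, 200],
    "randomforestregressor__max_depth": [2, 4, None],
}

# RandomizedSearchCV samples n_iter parameter combinations and
# cross-validates each one, keeping the best-scoring model.
search = RandomizedSearchCV(pipe, params, n_iter=5, cv=3, random_state=0)
search.fit(X_train, y_train)
print(search.best_params_)
```

After fitting, `search.best_estimator_` holds the refit pipeline, so it can be scored or used for prediction directly.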
- We must also import our class for handling...