Implementing random forest
Let’s try to improve our heart disease model with a random forest:
- First, let’s load the same libraries that we used in the previous section, except we will import the random forest classifier this time:
import pandas as pd
import numpy as np
from imblearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint
import sklearn.metrics as skmet
import os
import sys
sys.path.append(os.getcwd() + "/helperfunctions")
import healthinfo as hi
We also load the healthinfo module; it loads the health information data and does our preprocessing. There is nothing fancy here. The preprocessing code we stepped through earlier was just copied to the helperfunctions subfolder of the current working directory.
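To see why appending a folder to sys.path makes a module in that folder importable, here is a minimal sketch. It does not use the real healthinfo module, whose contents are not shown here. Instead, it assumes a hypothetical demo_helper module written to a temporary helperfunctions folder:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the helperfunctions subfolder,
# created in a temporary directory so nothing real is touched.
helper_dir = Path(tempfile.mkdtemp()) / "helperfunctions"
helper_dir.mkdir()

# Hypothetical helper module (an assumption for illustration only).
(helper_dir / "demo_helper.py").write_text(
    "def describe():\n"
    "    return 'loaded from helperfunctions'\n"
)

# Same idea as sys.path.append(os.getcwd() + "/helperfunctions"):
# Python searches every folder listed in sys.path when resolving an import.
sys.path.append(str(helper_dir))
importlib.invalidate_caches()  # make sure the newly written file is picked up

import demo_helper

message = demo_helper.describe()
print(message)
```

Because the folder is appended rather than inserted at the front, any module with the same name that appears earlier on sys.path would be imported first, so a distinctive module name helps avoid clashes.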
- Now, let’s grab the data that’s been processed by the healthinfo module so that we can use it...