Predicting breast cancer outcomes using Random Forests
We are now going to predict the outcomes for some patients using random forests. A random forest is an ensemble method: it combines many instances of another machine learning algorithm, in this case decision trees, to arrive at robust conclusions about the data. We are going to use the same example as in the previous recipe: breast cancer traits and outcomes.
This recipe has two main goals: to introduce you to random forests and to discuss issues regarding the training of machine learning algorithms.
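To make the ensemble idea concrete before turning to the real code, here is a minimal, standard-library-only sketch. It is not the recipe's code and not a full random forest: each "tree" is a single-split decision stump trained on a bootstrap sample and one randomly chosen feature, and the ensemble predicts by majority vote. The function names and the toy data are made up for illustration.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Find the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        # Try both orientations: (label at or below threshold, label above it)
        for low, high in ((0, 1), (1, 0)):
            preds = [low if row[feature] <= threshold else high for row in rows]
            errors = sum(p != y for p, y in zip(preds, labels))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, low, high)
    return best[1:]

def predict_stump(stump, row):
    feature, threshold, low, high = stump
    return low if row[feature] <= threshold else high

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))           # random feature choice
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Made-up toy data: two numeric traits per sample, label 1 for the "high" group
rows = [(1, 2), (2, 1), (2, 3), (3, 2), (7, 8), (8, 7), (8, 9), (9, 8)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
print(predict_forest(forest, (1, 1)), predict_forest(forest, (9, 9)))
```

Any individual stump can be misled by an unlucky bootstrap sample, but the vote across many of them is much more stable, which is the sense in which the ensemble is robust.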
Getting ready
The code for this recipe can be found in Chapter10/Random_Forest.py.
How to do it…
Take a look at the code:
- We start, as in the previous recipe, by getting rid of samples with missing information:
    import pandas as pd
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import export_graphviz
    ...
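As a rough sketch of what removing samples with missing information and then fitting a forest can look like, the example below uses a small made-up table rather than the recipe's data file. The column names, the use of "?" as the missing-value marker, and all parameter values are assumptions chosen for illustration; the recipe's own code may differ.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Made-up toy table; "?" stands in for a missing value (an assumed encoding)
raw = pd.DataFrame({
    "clump_thickness": [5, 3, "?", 8, 1, 10, 2, 7, 4, 9, 1, 8],
    "cell_size":       [1, 1, 2, 10, 1, 7, "?", 8, 2, 9, 1, 7],
    "outcome":         [0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

# Coerce non-numeric entries to NaN, then drop every row that has one
clean = raw.apply(pd.to_numeric, errors="coerce").dropna()

X = clean[["clump_thickness", "cell_size"]]
y = clean["outcome"].astype(int)

# Hold out part of the data so the model is scored on samples it has not seen
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(len(raw), len(clean), accuracy)
```

Dropping incomplete rows is the simplest option; with a table this small the accuracy figure is not meaningful and is printed only to show the workflow.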