It turns out that decision trees are easy to make; in fact, it's crazy just how easy it is, with only a few lines of Python code. So let's give it a try.
I've included a PastHires.csv file with your book materials. It contains some fabricated data that I made up about people who either got a job offer or not, based on the attributes of those candidates.
import numpy as np
import pandas as pd
from sklearn import tree

input_file = "c:/spark/DataScience/PastHires.csv"
df = pd.read_csv(input_file, header=0)
You'll want to change the path I used here for my own system (c:/spark/DataScience/PastHires.csv) right away, so that it points to wherever you installed the materials for this book. I'm not sure where you put them, but it's almost certainly not there.
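If you'd rather get a clear message than a long traceback when the path is wrong, one option is to check for the file before handing it to pandas. The following is only a minimal sketch: the helper name find_input_file and its materials_dir parameter are hypothetical and not part of the book's code, so adapt them as you see fit.

```python
from pathlib import Path


def find_input_file(materials_dir, filename="PastHires.csv"):
    """Return the full path to filename inside materials_dir.

    Hypothetical helper: raises FileNotFoundError with a readable
    message if the file is not where we expect it.
    """
    candidate = Path(materials_dir).expanduser() / filename
    if not candidate.is_file():
        raise FileNotFoundError(
            f"Could not find {filename} in {materials_dir}; "
            "set materials_dir to the folder where you unpacked the book materials."
        )
    # pandas accepts a plain string path, so return one.
    return str(candidate)
```

With something like this in place, you could write input_file = find_input_file("c:/spark/DataScience") and then call pd.read_csv(input_file, header=0) exactly as before, swapping in your own folder.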
We will use...