Implementing a decision tree with scikit-learn
Now that we are familiar with the mathematics behind decision trees, let us implement a simple decision tree using scikit-learn. We will use the widely available iris dataset, which contains the petal and sepal dimensions of flowers along with their species. The goal of this exercise is to build a classifier that assigns a flower to a species based on its petal and sepal dimensions.
To do this, let's first import the dataset and have a look at it:
import pandas as pd
data = pd.read_csv('E:/Personal/Learning/Predictive Modeling Book/My Work/Chapter 7/iris.csv')
data.head()
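The path above points to a local copy of the CSV file. If that file is not at hand, a comparable DataFrame can be built from the copy of the iris data bundled with scikit-learn. The following is a minimal sketch under that assumption: the column names are chosen here to mirror the ones used in this section and are not provided by scikit-learn, and the species labels in the bundled copy may be spelled differently from those in the CSV.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the copy of the iris data bundled with scikit-learn (no local CSV needed).
iris = load_iris()

# Assumption: these column names mirror the CSV described in the text;
# scikit-learn itself uses names such as 'sepal length (cm)'.
columns = ['Sepal-length', 'Sepal-width', 'Petal-length', 'Petal-width']
data = pd.DataFrame(iris.data, columns=columns)

# Map the integer class codes in iris.target to their species names.
data['Species'] = iris.target_names[iris.target]

# Preview the first five rows, as data.head() does for the CSV version.
print(data.head())
```

Building the DataFrame this way keeps the rest of the exercise independent of where the CSV file is stored.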
The first few rows of the dataset look as follows:
Sepal-length, Sepal-width, Petal-length, and Petal-width are the dimensions of the flower, while Species denotes the class the flower belongs to. There are actually...