We will continue to explore the iris dataset further by focusing on the first two features (sepal length and sepal width), optimizing the decision tree, and creating some visualizations.
Tuning a decision tree
Getting ready
- Load the iris dataset, focusing on the first two features. Additionally, split the data into training and testing sets:
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data[:,:2]
y = iris.target
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y)
- View the data with pandas:
import pandas as pd
pd.DataFrame(X,columns=iris.feature_names[:2])
- Before optimizing the decision tree, let's try a single decision...