Defining classification
In this chapter, you will discover classification. Classification is a supervised machine learning task in which a model is constructed that assigns observations to a category.
The simplest types of classification models that everybody tends to know are decision trees. Let's consider a super simple example of how a decision tree could be used for classification.
Imagine that we have a dataset in which we have observations about five humans and five animals. The goal is to use this data to build a decision tree that can be used on any new, unseen animal or human.
The data can be imported as follows:
Code Block 6-1
import pandas as pd
# example to classify human vs animal
#dataset with one variable
can_speak = [True,True,True,True,True,True,True,False,False,False]
has_feathers = [False,False,False,False,False,True,True,False,False,False]
is_human = [True,True,True,True,True,False,False,False,False,False]
data = pd.DataFrame...