Label encoding
When we perform classification, we usually deal with many labels. These labels can be words, numbers, or something else. The machine learning functions in sklearn expect them to be numbers, so if the labels are already numeric, we can use them directly to start training. This is usually not the case, though.
In the real world, labels are usually words, because words are human-readable. We label our training data with words so that the mapping can be tracked. To convert word labels into numbers, we use a label encoder. Label encoding is the process of transforming word labels into numerical form so that the algorithms can operate on our data.
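To make the idea concrete before turning to sklearn, here is a minimal hand-rolled sketch of what a label encoder does. It is an illustration under stated assumptions, not sklearn's implementation: the names `classes` and `label_to_index` are hypothetical, and the choice to number the distinct labels in sorted order is made here simply so the mapping is deterministic.

```python
# Illustrative sketch only: a hand-rolled label-to-integer mapping.
# The variable names below are hypothetical, chosen for this example.
labels = ['red', 'black', 'red', 'green', 'black', 'yellow', 'white']

# Collect the distinct labels and fix their order by sorting them
classes = sorted(set(labels))

# Map each distinct label to its position in the sorted list
label_to_index = {label: index for index, label in enumerate(classes)}

# Encode: word labels -> integers
encoded = [label_to_index[label] for label in labels]

# Decode: integers -> word labels, recovering the original list
decoded = [classes[index] for index in encoded]

print(classes)
print(encoded)
print(decoded)
```

Because the mapping is kept around, the encoded numbers can always be translated back into the original words, which is what lets us track the mapping after training.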
Create a new Python file and import the following packages:
import numpy as np
from sklearn import preprocessing
Define some sample labels:
# Sample input labels
input_labels = ['red', 'black', 'red', 'green', 'black', 'yellow', 'white']
Create the label encoder object and...