Encoding categorical features
There are several reasons why we might need to encode features before using them in most machine learning algorithms. First, these algorithms typically require numeric data. Second, when a categorical feature is represented with numbers, for example, 1 for female and 2 for male, we need to encode the values so that the model treats them as categories rather than as quantities with an order or magnitude. Third, the feature might actually be ordinal, with a discrete number of values that represent a meaningful ranking, and our encoding needs to preserve that ranking. Finally, a categorical feature might have a large number of values (known as high cardinality), and we might want our encoding to collapse categories.
We can handle the encoding of features with a limited number of values, say 15 or fewer, with one-hot encoding. In this section, we will first go over one-hot encoding and then discuss ordinal encoding. We will look at strategies for handling categorical features with high cardinality in the next section.
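Before we work through these techniques in detail, here is a minimal sketch of both encodings using scikit-learn's OneHotEncoder and OrdinalEncoder on a small, made-up DataFrame. The column names and category values are assumptions for illustration only, and the sparse_output parameter assumes scikit-learn 1.2 or later (earlier versions use sparse instead):

import pandas as pd
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical example data; column names and values are illustrative only
df = pd.DataFrame({
    "gender": ["female", "male", "female"],
    "education": ["high school", "bachelor", "master"]
})

# One-hot encoding: one binary column per category value
ohe = OneHotEncoder(sparse_output=False, handle_unknown="ignore")
gender_encoded = ohe.fit_transform(df[["gender"]])
print(ohe.get_feature_names_out())  # ['gender_female' 'gender_male']
print(gender_encoded)

# Ordinal encoding: pass an explicit category order so the
# encoded integers preserve the meaningful ranking
ord_enc = OrdinalEncoder(categories=[["high school", "bachelor", "master"]])
education_encoded = ord_enc.fit_transform(df[["education"]])
print(education_encoded)  # 0.0, 1.0, 2.0, respecting the ranking

Note that one-hot encoding creates a new binary column for each category value, which is why it is best suited to features with a limited number of values, while ordinal encoding keeps a single column whose integer codes follow the ranking we supply.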