Dealing with categorical features
Data transformation methods for categorical features vary according to the sub-type of the variable. In the upcoming sections, you will learn how to transform nominal and ordinal features.
Transforming nominal features
You may have to create numerical representations of your categorical features before applying ML algorithms to them. Some libraries may have embedded logic to handle that transformation for you, but most of them do not.
The first transformation you will learn is known as label encoding. A label encoder is suitable for categorical/nominal variables: it simply associates a number with each distinct label of the variable. Table 4.2 shows how a label encoder works.
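To make the idea concrete, here is a minimal sketch of a label encoder in plain Python. The helper names `fit_label_encoder` and `transform` are hypothetical and chosen only for illustration, and the sample list of countries is assumed. Numbering labels from 1 in order of first appearance is also an assumption, made so that the output lines up with the values in Table 4.2.

```python
def fit_label_encoder(values):
    """Build a label -> integer mapping, numbering labels in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            # Start at 1 (an assumption, chosen to match Table 4.2).
            mapping[value] = len(mapping) + 1
    return mapping


def transform(values, mapping):
    """Replace each label with the integer assigned to it."""
    return [mapping[value] for value in values]


# Illustrative data only: a nominal feature with a repeated label.
countries = ["India", "Canada", "India"]

mapping = fit_label_encoder(countries)
encoded = transform(countries, mapping)

print(mapping)  # {'India': 1, 'Canada': 2}
print(encoded)  # [1, 2, 1]
```

Note that library implementations may number the labels differently. For example, scikit-learn's `LabelEncoder` sorts the distinct labels and assigns integers starting at 0, so its output for the same data would not match the table value for value.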
Country | Label encoding |
India | 1 |
Canada | 2 |
... |