One-hot encoding represents each category of a categorical variable with a binary variable. As a result, one-hot encoding high-cardinality variables, or datasets with many categorical features, can expand the feature space dramatically. To reduce the number of binary variables, we can one-hot encode only the most frequent categories. Doing so is equivalent to treating the remaining, less frequent categories as a single group, because every observation outside the top categories receives 0 in all of the binary variables. We will discuss this grouping in the Grouping rare or infrequent categories recipe toward the end of this chapter.
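To make the idea concrete, here is a minimal sketch in pandas. It assumes a toy DataFrame with a single `city` column; the data, the helper name `one_hot_top_categories`, and the `top_n` parameter are illustrative only and are not taken from this recipe.

```python
import pandas as pd

# Toy data (hypothetical): "London" and "Paris" are the most frequent categories.
df = pd.DataFrame(
    {"city": ["London", "Paris", "London", "Rome", "Paris", "London", "Oslo", "Lima"]}
)


def one_hot_top_categories(data, column, top_n):
    """Add one binary column per category for the top_n most frequent categories."""
    # value_counts() sorts categories by descending frequency.
    top = data[column].value_counts().head(top_n).index.tolist()
    encoded = data.copy()
    for category in top:
        # 1 where the row holds this category, 0 otherwise.
        encoded[f"{column}_{category}"] = (encoded[column] == category).astype(int)
    return encoded, top


encoded, top = one_hot_top_categories(df, "city", top_n=2)
print(top)
print(encoded)
```

Rows whose city falls outside the top categories get 0 in every new column, which is why the less frequent categories end up behaving like one group. Ties in frequency at the cut-off would need an explicit rule, which this sketch does not attempt.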
For more details on variable cardinality and frequency, visit the Determining cardinality in categorical variables recipe and the Pinpointing rare categories in categorical variables recipe in Chapter 1, Foreseeing Variable Problems...