In one-hot encoding, we represent a categorical variable as a group of binary variables, where each binary variable represents one category. The binary variable indicates whether the category is present in an observation (1) or not (0). The following table shows the one-hot encoded representation of the Gender variable with the categories of Male and Female:
Gender | Female | Male |
Female | 1 | 0 |
Male | 0 | 1 |
Male | 0 | 1 |
Female | 1 | 0 |
Female | 1 | 0 |
Â
As shown in the table, from the Gender variable, we can derive the binary variable of Female, which shows the value of 1 for females, or the binary variable of Male, which takes the value of 1 for the males in the dataset.
For the categorical variable of Color with the values of red, blue, and green, we can create three variables called...