Categorical variables are a problem. On one hand they provide valuable information; on the other hand, it's probably text—either the actual text or integers corresponding to the text—such as an index in a lookup table.
So, we clearly need to represent our text as integers for the model's sake, but we can't just use the id field or naively represent them. This is because we need to avoid a similar problem to the Creating binary features through thresholding recipe. If we treat data that is continuous, it must be interpreted as continuous.