6.6 Categorical predictors
A categorical variable takes one of a limited set of values, each representing a distinct group or category. These values are typically labels or names that carry no numerical meaning on their own. Some examples are:
Political affiliation: conservative, liberal, or progressive.
Sex: female or male.
Customer satisfaction level: very unsatisfied, unsatisfied, neutral, satisfied, or very satisfied.
Linear regression models can easily accommodate categorical variables; we just need to encode the categories as numbers. There are a few ways to do so, and Bambi handles the details for us. The devil is in the interpretation of the results, as we will explore in the next two sections.
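To make the idea of encoding categories as numbers concrete, here is a minimal sketch of one common option, usually called dummy (or treatment) coding, written by hand in plain Python. The data and the variable names (affiliation, levels, dummies) are illustrative assumptions and are not taken from any dataset used in this chapter; in practice, a library would build these columns for us.

```python
# Dummy (treatment) coding by hand: one level acts as the reference
# and every other level gets its own 0/1 indicator column.
# The data below is made up for illustration only.
affiliation = ["conservative", "liberal", "progressive", "liberal"]

# Sort the unique labels so the choice of reference level is reproducible
levels = sorted(set(affiliation))  # ['conservative', 'liberal', 'progressive']
reference, *others = levels        # 'conservative' becomes the reference

# One indicator column per non-reference level
dummies = {
    level: [int(value == level) for value in affiliation]
    for level in others
}

for level, column in dummies.items():
    print(level, column)
# liberal [0, 1, 0, 1]
# progressive [0, 0, 1, 0]
```

Notice that three categories need only two columns: an observation with zeros in both columns belongs to the reference level. In a linear model with an intercept, the coefficient of each indicator is then read as a difference with respect to the reference level, which is why interpretation requires some care.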
6.6.1 Categorical penguins
For the current example, we are going to use the palmerpenguins dataset, Horst et al. [2020], which contains 344 observations of 8 variables. For the moment, we are interested in modeling the mass of the...