Regression with a non-binary predictor
In a previous section, I promised that the same dummy-coding method that we used for binary categorical predictors could be adapted to handle categorical variables with more than two values. As an example of this, we are going to use the same WeightLoss dataset that we used to illustrate ANOVA.
To review, the WeightLoss dataset contains pounds lost and self-esteem measurements over three weeks for three different groups: a control group, a group that only dieted, and a group that dieted and exercised. We will try to predict the amount of weight lost in week 2 from the group the participant was in.
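Before fitting anything, it can help to look at the data again. The following is a minimal sketch, assuming the car package is installed and that the second weight-loss measurement is stored in a column named wl2 (group_means is just an illustrative variable name):

```r
# Assumption: the car package is installed; loading it makes WeightLoss available
library(car)

# One row per participant: a group factor plus weight-loss and self-esteem columns
str(WeightLoss)

# Number of participants in each of the three groups
table(WeightLoss$group)

# Mean weight lost at the second measurement (wl2), computed separately by group
group_means <- tapply(WeightLoss$wl2, WeightLoss$group, mean)
group_means
```

These three group means are the quantities that the regression coefficients will end up describing.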
Instead of just having one dummy-coded predictor, we now need two. Specifically:
![](https://static.packt-cdn.com/products/9781785288142/graphics/B04324_08_51.jpg)
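To see this coding concretely, we can ask R to show the dummy variables it would construct. This is only a sketch: it assumes the car package is installed, that R's default treatment contrasts are in effect, and that the control group is the first level of the group factor (and therefore the reference level). The name X is just illustrative.

```r
# Assumption: the car package is installed; loading it makes WeightLoss available
library(car)

# Default treatment (dummy) contrasts for the three-level factor:
# the first level gets 0 on both dummy variables and acts as the reference
contrasts(WeightLoss$group)

# The design matrix that lm() would build for wl2 ~ group:
# an intercept column plus one 0/1 column for each non-reference level
X <- model.matrix(wl2 ~ group, data = WeightLoss)
head(X)
```

Each participant has a 1 in at most one of the two dummy columns, and participants in the reference group have 0 in both.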
Consequently, the equations describing our predictive model are:
![](https://static.packt-cdn.com/products/9781785288142/graphics/B04324_08_52.jpg)
This means that the intercept is the mean weight lost in the control group, the coefficient on the first dummy variable is the difference in weight lost between the control and diet-only groups, and the coefficient on the second dummy variable is the difference in the weight lost between the control...