Mission accomplished
The first part of the mission was to understand risk factors for cardiovascular disease, and you've determined that the top four risk factors are systolic blood pressure (ap_hi), age, cholesterol, and weight according to the logistic regression model, of which only age is non-modifiable. However, you also realized that systolic blood pressure (ap_hi) is not as meaningful on its own since it relies on diastolic blood pressure (ap_lo) for interpretation. The same goes for weight and height. We learned that the interaction of features plays a crucial role in interpretation, and so does their relationship with each other and the target variable, whether linear or monotonic. Furthermore, the data is only a representation of the truth, which can be wrong. After all, we found anomalies that, left unchecked, can bias our model.
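To make the point about dependent features concrete, here is a minimal sketch of how weight and height, or the two blood pressure readings, might be combined into a single, more interpretable value, and how an implausible reading could be screened out first. It is an illustration only, not the method used earlier in the chapter. The helper names (bmi, pulse_pressure, is_plausible_bp) and the three records are hypothetical, and the units (kilograms, centimeters, mmHg) are assumptions.

```python
# Illustrative sketch only: helper names and sample records are hypothetical.
# Assumed units: weight in kg, height in cm, blood pressure in mmHg.


def bmi(weight_kg, height_cm):
    """Body mass index: weight in kg divided by height in meters, squared."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


def pulse_pressure(ap_hi, ap_lo):
    """Difference between systolic (ap_hi) and diastolic (ap_lo) pressure."""
    return ap_hi - ap_lo


def is_plausible_bp(ap_hi, ap_lo):
    """Basic sanity check: diastolic must be positive and below systolic."""
    return 0 < ap_lo < ap_hi


# Made-up records; the third has its readings swapped, a typical anomaly.
records = [
    {"ap_hi": 120, "ap_lo": 80, "weight": 70.0, "height": 175},
    {"ap_hi": 140, "ap_lo": 90, "weight": 95.0, "height": 160},
    {"ap_hi": 80, "ap_lo": 120, "weight": 60.0, "height": 165},
]

# Drop implausible readings, then derive the combined features.
clean = []
for r in records:
    if not is_plausible_bp(r["ap_hi"], r["ap_lo"]):
        continue
    clean.append(
        {
            **r,
            "bmi": round(bmi(r["weight"], r["height"]), 1),
            "pulse_pressure": pulse_pressure(r["ap_hi"], r["ap_lo"]),
        }
    )

for r in clean:
    print(r)
```

A derived feature such as body mass index can be read without knowing a second column, which is why combining dependent features often makes a model's coefficients easier to interpret. Whether to filter, correct, or keep an anomalous record is a judgment call that depends on how the data was collected.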
Another source of bias is how the data was collected. After all, you can wonder why the model's top features were all objective and...