Assessing causation in machine learning models
Calculating the correlation between features and outcomes has been a common approach in machine learning modeling across many fields and industries. For example, we can calculate the Pearson correlation coefficient to identify features that are correlated with the target variable. Many of our machine learning models also rely on features that contribute to the prediction of outcomes not as causes but as correlative predictors. Several techniques, with functionality available in Python, can help us differentiate between such correlative and causal features. Here are a few examples:
- Experimental design: One way to establish causality is to conduct experiments in which we change a candidate causal feature and measure the effect on the target variable. However, such experimental studies may not always be feasible or ethical.
- Feature importance: We can use explainability techniques, as presented in Chapter 6, Interpretability and...
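The difference between a correlative and a causal feature, and the role of an experiment in telling them apart, can be sketched on synthetic data. The example below is a minimal illustration under stated assumptions, not a general causal inference method: the `simulate` function, the variable names, and all coefficients are hypothetical and chosen only for demonstration. A hidden confounder drives both a proxy feature and the target, so the proxy is correlated with the target without causing it. Assigning the proxy at random, as an experiment would, should remove that correlation.

```python
import numpy as np


def simulate(n=5000, randomize_proxy=False, seed=0):
    """Generate synthetic data (all coefficients are illustrative assumptions).

    causal     -> directly affects the target
    confounder -> affects both the proxy and the target
    proxy      -> correlated with the target only through the confounder
    """
    rng = np.random.default_rng(seed)
    confounder = rng.normal(size=n)
    causal = rng.normal(size=n)
    if randomize_proxy:
        # Experimental setting: the proxy is assigned at random,
        # which breaks its link to the confounder.
        proxy = rng.normal(size=n)
    else:
        # Observational setting: the proxy follows the confounder.
        proxy = confounder + 0.5 * rng.normal(size=n)
    # The target depends on the causal feature and the confounder, never on the proxy.
    target = 2.0 * causal + 1.5 * confounder + 0.5 * rng.normal(size=n)
    return causal, proxy, target


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


# Observational data: both features appear correlated with the target.
causal, proxy, target = simulate()
r_causal = pearson(causal, target)
r_proxy = pearson(proxy, target)

# Simulated experiment: randomizing the proxy removes its correlation with the target.
_, proxy_rand, target_rand = simulate(randomize_proxy=True)
r_proxy_rand = pearson(proxy_rand, target_rand)

print(f"causal feature vs. target (observational): {r_causal:.2f}")
print(f"proxy feature vs. target (observational):  {r_proxy:.2f}")
print(f"proxy feature vs. target (randomized):     {r_proxy_rand:.2f}")
```

In the observational data, the Pearson coefficient alone cannot tell us that the proxy has no effect on the target. Only the randomized version exposes this, which is why experimental design is listed as a way to establish causality when such an experiment is feasible.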