Exploring counterfactual explanations
Counterfactuals are an integral part of human reasoning. How many of us have muttered the words “If I had done X instead, my outcome Y would have been different”? There are always one or two things that, if done differently, could have led to the outcome we prefer!
In machine learning, you can leverage this way of reasoning to produce extremely human-friendly explanations, where decisions are explained in terms of what would need to change to get the opposite outcome (the counterfactual class). After all, we are often interested in knowing how to turn a negative outcome into a better one. For instance, how do you get your denied loan application approved, or decrease your risk of cardiovascular disease from high to low? Ideally, the answers to those questions aren’t a long list of changes: you would prefer the smallest number of changes required to change your outcome.
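To make the idea of "the smallest number of changes" concrete, here is a minimal sketch of a brute-force counterfactual search. Everything in it is an assumption made for illustration: approve_loan is a hypothetical hand-written rule standing in for a trained classifier, and the feature names, thresholds, and candidate values are invented. It tries one change at a time, then two, and so on, and returns the first set of changes that flips the outcome. It is not how dedicated counterfactual libraries work internally.

```python
from itertools import combinations, product


def approve_loan(applicant):
    """Hypothetical stand-in for a trained classifier: True means approved."""
    score = 0
    score += 2 if applicant["income"] >= 50_000 else 0
    score += 1 if applicant["credit_score"] >= 650 else 0
    score += 1 if applicant["debt_ratio"] <= 0.35 else 0
    return score >= 3


# Hypothetical alternative values the applicant could plausibly reach.
CANDIDATE_VALUES = {
    "income": [50_000, 60_000],
    "credit_score": [650, 700],
    "debt_ratio": [0.35, 0.25],
}


def smallest_counterfactual(applicant, predict, candidates):
    """Return the first set of feature changes, fewest changes first,
    that flips the prediction; None if no candidate combination does."""
    original = predict(applicant)
    features = list(candidates)
    for n_changes in range(1, len(features) + 1):
        for subset in combinations(features, n_changes):
            for values in product(*(candidates[f] for f in subset)):
                changed = dict(zip(subset, values))
                # Skip combinations that leave a chosen feature unchanged.
                if any(applicant[f] == v for f, v in changed.items()):
                    continue
                trial = {**applicant, **changed}
                if predict(trial) != original:
                    return changed
    return None


applicant = {"income": 42_000, "credit_score": 700, "debt_ratio": 0.30}
print(approve_loan(applicant))  # False: the application is denied
print(smallest_counterfactual(applicant, approve_loan, CANDIDATE_VALUES))
# {'income': 50000}
```

In this toy case, a single change (raising income to 50,000) is enough to move the applicant to the counterfactual class, so the search never needs to consider two or more changes at once. An exhaustive search like this only scales to a handful of features and candidate values.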
Regarding fairness, counterfactuals are an important...