Discovering the counterfactual explanation strategy
Counterfactual explanation, or counterfactual reasoning, is a way of understanding and explaining events by considering alternative "what-if" scenarios. In the context of prediction explanations, it involves identifying the changes to the input data that would lead to a different outcome; ideally, the smallest such changes. In the context of NN interpretation, it involves visualizing the opposite of the target label or of intermediate latent features. This approach works well because it closely mirrors how humans naturally explain events and assess causality, which ultimately helps us better understand the model's underlying decision-making process.
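To make the idea of "minimal changes that flip the outcome" concrete, here is a minimal sketch of a counterfactual search. It is purely illustrative and not tied to any particular counterfactual library: it trains a simple scikit-learn classifier on synthetic data and then uses a naive hill-climbing loop whose loss rewards crossing the decision boundary while penalizing the distance from the original instance. The dataset, step size, and loss weights are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Illustrative synthetic data and a simple classifier to explain.
X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
model = LogisticRegression().fit(X, y)

def find_counterfactual(x, model, steps=2000, step_size=0.05, lam=0.1):
    """Greedy random search for a nearby instance that the model assigns
    to the opposite class. `lam` trades off closeness to the original
    instance against the probability of the counterfactual class."""
    target = 1 - model.predict(x.reshape(1, -1))[0]  # aim for the other class
    rng = np.random.default_rng(0)
    best = x.copy()

    def loss(candidate):
        # Distance term keeps the change minimal; probability term pushes
        # the prediction toward the desired (counterfactual) class.
        p_target = model.predict_proba(candidate.reshape(1, -1))[0, target]
        return lam * np.linalg.norm(candidate - x) + (1.0 - p_target)

    best_loss = loss(best)
    for _ in range(steps):
        candidate = best + rng.normal(scale=step_size, size=x.shape)
        candidate_loss = loss(candidate)
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best

x0 = X[0]
cf = find_counterfactual(x0, model)
print("original class:       ", model.predict(x0.reshape(1, -1))[0])
print("counterfactual class: ", model.predict(cf.reshape(1, -1))[0])
print("total change (L2):    ", round(float(np.linalg.norm(cf - x0)), 3))
```

The printed output contrasts the model's prediction for the original instance with its prediction for the perturbed one, and the L2 norm of the change indicates how small the perturbation is; dedicated counterfactual methods follow the same idea but optimize this trade-off far more carefully (for example, enforcing sparsity or plausibility constraints).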
Humans tend to think in terms of cause and effect, and we often explore alternative possibilities to make sense of events or decisions. For example, when trying to understand why a certain decision was made,...