The approach
You have decided to do the following:
- Train proxy models: You don't have the original features or model, but all is not lost because you have the COMPAS scores, which serve as the labels. You also have features relevant to the problem that you can connect to these labels with models. By approximating the COMPAS model via the proxies, you can assess the unfairness of the COMPAS decisions. In this chapter, we will train a CatBoost model and a neural network model.
- Anchor explanations: This method unearths insights into why the proxy model makes specific predictions using a series of rules called anchors, which tell you where the decision boundaries lie. The boundaries are relevant to our mission because we want to know why a defendant has been wrongfully predicted to recidivate. They only approximate the boundaries of the original model, but there's still some truth to them.
- Counterfactual explanations: The opposite concept to anchors is about understanding...
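
To make the first two steps of this plan concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the `black_box_score` function is a hypothetical stand-in for an opaque scoring model (not COMPAS), the two features (`priors`, `age`) and the data are synthetic, and a one-feature threshold rule stands in for the CatBoost and neural network proxies used in the chapter. It also does not run the anchor search itself; it only measures the precision and coverage of one hand-picked candidate rule, the two quantities by which an anchor is judged.

```python
import random

random.seed(0)

def black_box_score(priors, age):
    """Hypothetical opaque model: returns 1 (high risk) or 0 (low risk)."""
    return 1 if priors * 2 - (age - 18) * 0.1 > 3 else 0

# Synthetic rows of (priors, age); we only observe the black box's outputs.
rows = [(random.randint(0, 10), random.randint(18, 70)) for _ in range(500)]
labels = [black_box_score(p, a) for p, a in rows]

def fit_stump(rows, labels, feature):
    """Fit the rule `row[feature] > t -> 1` by picking the most accurate t."""
    best_t, best_acc = None, -1.0
    for t in sorted({r[feature] for r in rows}):
        hits = sum((r[feature] > t) == bool(y) for r, y in zip(rows, labels))
        acc = hits / len(rows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# The proxy only sees `priors`; fidelity is its agreement with the black box.
threshold, fidelity = fit_stump(rows, labels, feature=0)

def proxy_predict(row):
    return int(row[0] > threshold)

def rule_stats(rows, predict, condition, target):
    """Precision and coverage of a candidate rule for a target prediction."""
    covered = [r for r in rows if condition(r)]
    coverage = len(covered) / len(rows)
    if not covered:
        return 0.0, coverage
    precision = sum(predict(r) == target for r in covered) / len(covered)
    return precision, coverage

# Candidate rule (hand-picked, not searched for): "priors >= 5" -> high risk.
precision, coverage = rule_stats(rows, proxy_predict, lambda r: r[0] >= 5, 1)

print(f"proxy threshold on priors: {threshold}, fidelity: {fidelity:.3f}")
print(f"rule 'priors >= 5': precision={precision:.3f}, coverage={coverage:.3f}")
```

Fidelity here measures how closely the proxy tracks the black box, not how accurate or fair either model is. A rule with high precision and reasonable coverage describes a region where the proxy's prediction is stable, which is the sense in which anchors hint at where a decision boundary lies.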