Detecting post-training bias with SageMaker Clarify
In the previous recipe, we used SageMaker Clarify to detect pre-training bias in our data. In this recipe, we will use SageMaker Clarify to detect post-training bias in the same dataset. We will train a model on this dataset and use it to compute the post-training bias metrics. Specifically, we will compute the Difference in Positive Proportions in Predicted Labels (DPPL) and Recall Difference (RD) metric values and check the results after the processing job has finished running.
Note
Why is this important? If the metric value for DPPL suggests bias against a disadvantaged group, this means that the machine learning model has a higher chance of predicting positive outcomes for the advantaged group. For example, if the advantaged group involves male applicants and the disadvantaged group involves female applicants, a machine learning model may accept more scholarship...
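To make the two metrics concrete, here is a minimal, self-contained Python sketch of how DPPL and RD are defined. Note that this is plain Python over hypothetical labels, not the SageMaker Clarify API; in the recipe itself, the Clarify processing job computes these values for you. The group sizes and label values below are illustrative assumptions.

```python
def dppl(y_pred, facet):
    """Difference in Positive Proportions in Predicted Labels:
    P(pred=1 | advantaged) - P(pred=1 | disadvantaged)."""
    adv = [p for p, f in zip(y_pred, facet) if not f]
    dis = [p for p, f in zip(y_pred, facet) if f]
    return sum(adv) / len(adv) - sum(dis) / len(dis)

def recall_difference(y_true, y_pred, facet):
    """Recall Difference: recall(advantaged) - recall(disadvantaged)."""
    def recall(yt, yp):
        tp = sum(1 for t, p in zip(yt, yp) if t == 1 and p == 1)
        return tp / sum(yt)
    adv = [(t, p) for t, p, f in zip(y_true, y_pred, facet) if not f]
    dis = [(t, p) for t, p, f in zip(y_true, y_pred, facet) if f]
    return recall(*zip(*adv)) - recall(*zip(*dis))

# Hypothetical data: the first four rows belong to the advantaged group
# (facet=False), the last four to the disadvantaged group (facet=True).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
facet  = [False, False, False, False, True, True, True, True]

print(dppl(y_pred, facet))                       # 0.5: more predicted positives for the advantaged group
print(recall_difference(y_true, y_pred, facet))  # 0.5: higher recall for the advantaged group
```

Both metrics are zero when the model treats the groups identically; positive values indicate the model favors the advantaged group, which is the pattern the Note above describes.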