Understanding bias, fairness in ML, and ML explainability
There are two types of bias in ML that we can analyze and mitigate to ensure fairness: data bias and model bias. Data bias is an imbalance in the training data across different groups and categories. It can be introduced into an ML solution simply through sampling error, or more subtly through inherent disparities that are unfortunately ingrained in society. Data bias, if neglected, can translate into poor overall accuracy and unfair predictions against certain groups in the trained model. It is more critical than ever to discover inherent biases in the data early and take action to address them. Model bias, on the other hand, refers to bias introduced by model predictions, such as the distribution of classifications and errors between advantaged and disadvantaged groups. Should the model favor an advantaged group for a particular outcome or disproportionately predict incorrectly for a disadvantaged group, causing...
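To make the two notions concrete, here is a minimal sketch in plain Python using hypothetical data and two illustrative metrics (the function names and the group labels "A" and "D" are assumptions, not part of any particular library): class imbalance captures data bias as the normalized difference in group representation, and demographic parity difference captures one form of model bias as the gap in positive-prediction rates between groups.

```python
def class_imbalance(groups, advantaged="A"):
    # Data bias: normalized representation gap, (n_a - n_d) / (n_a + n_d).
    # Ranges from -1 to 1; 0 means the two groups are equally represented.
    n_a = sum(1 for g in groups if g == advantaged)
    n_d = len(groups) - n_a
    return (n_a - n_d) / (n_a + n_d)

def demographic_parity_difference(groups, preds, advantaged="A"):
    # Model bias: difference in the rate of positive predictions (label 1)
    # between the advantaged and disadvantaged groups.
    pos_a = [p for g, p in zip(groups, preds) if g == advantaged]
    pos_d = [p for g, p in zip(groups, preds) if g != advantaged]
    return sum(pos_a) / len(pos_a) - sum(pos_d) / len(pos_d)

# Hypothetical dataset: group A is oversampled 8-to-2 (data bias), and the
# model predicts the positive outcome more often for group A (model bias).
groups = ["A"] * 8 + ["D"] * 2
preds = [1, 1, 1, 1, 1, 1, 0, 0,  # group A: 6/8 positive
         1, 0]                    # group D: 1/2 positive

print(class_imbalance(groups))                       # 0.6
print(demographic_parity_difference(groups, preds))  # 0.75 - 0.5 = 0.25
```

Values near 0 for both metrics would suggest balanced data and parity in predictions; the further either drifts from 0, the stronger the signal that mitigation is needed before deployment.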