Problem
A common approach to NLP explainability is perturbation: removing or masking words from an input and measuring how the model's prediction changes. Continuing with the same Amazon Customer Reviews dataset, we will walk through an NLP anomaly detection example that uses Cleanlab (https://github.com/cleanlab/cleanlab), an open source library, to find potential label errors in text data. We will then use SHAP (https://github.com/slundberg/shap) to evaluate post hoc local explainability of model predictions by visualizing the feature attributions of individual classes based on computed SHAP values.
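To make the label-error idea concrete, here is a simplified, pure-Python sketch of the "self-confidence" heuristic that underlies Cleanlab's confident learning: an example is suspect when the model assigns low probability to its given label. This is illustrative only; the actual library call (`cleanlab.filter.find_label_issues`) uses calibrated per-class thresholds and out-of-sample predicted probabilities rather than this fixed cutoff.

```python
def rank_label_issues(pred_probs, labels, threshold=0.5):
    """Return indices of likely label errors, lowest self-confidence first.

    pred_probs: per-example predicted class probabilities (rows sum to 1).
    labels: the given, possibly noisy, label for each example.
    An example is flagged when the probability the model assigns to its
    given label (its "self-confidence") falls below the threshold.
    """
    suspects = []
    for i, (probs, label) in enumerate(zip(pred_probs, labels)):
        self_confidence = probs[label]
        if self_confidence < threshold:
            suspects.append((self_confidence, i))
    return [i for _, i in sorted(suspects)]

# Toy two-class (negative=0, positive=1) example:
pred_probs = [
    [0.90, 0.10],  # model: negative, label: negative -> consistent
    [0.20, 0.80],  # model: positive, label: positive -> consistent
    [0.95, 0.05],  # model: negative, label: positive -> likely label error
]
labels = [0, 1, 1]
print(rank_label_issues(pred_probs, labels))  # -> [2]
```

In practice, Cleanlab takes the same two inputs (given labels and predicted probabilities, ideally from cross-validation) and returns a ranked list of suspect indices for review.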
Post hoc local explainability means assessing how a particular decision or prediction was made after the model has been trained. Using a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model, we will classify positive versus negative sentiment in the Amazon Customer Reviews dataset and compare the predicted label errors.
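To ground what a SHAP value is before visualizing them: for a model and a feature (here, a word), it is that feature's average marginal contribution to the prediction across all subsets of the other features. The sketch below computes exact Shapley values by brute force for a toy "sentiment scorer"; the `score` function and its word weights are invented for illustration, and the SHAP library approximates this computation efficiently for real models like BERT.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, features):
    """Exact Shapley values for model f over the given features.

    f maps a frozenset of 'present' features to a score; absent
    features are treated as masked, mirroring how perturbation-based
    text explainers remove words from the input.
    """
    n = len(features)
    values = {}
    for i in features:
        others = [j for j in features if j != i]
        phi = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Weight of this subset in the Shapley average.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                # Marginal contribution of feature i given subset s.
                phi += weight * (f(s | {i}) - f(s))
        values[i] = phi
    return values

# Hypothetical toy sentiment model: "great" is positive on its own,
# but "not great" flips the sentiment.
def score(words):
    s = 0.0
    if "great" in words:
        s += 0.6
    if "not" in words and "great" in words:
        s -= 0.9
    return s

vals = shapley_values(score, ["not", "great"])
print(vals)  # "not" gets a negative attribution, "great" a positive one
```

The attributions sum to `score(all words) - score(no words)`, the efficiency property that makes SHAP plots additive: each word's bar shows its share of the push away from the baseline prediction.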
The following section provides an end-to-end solution walk-through.