Trying SHAP for NLP
Most of SHAP's explainers work with tabular data. DeepExplainer can handle text but is restricted to deep learning models, and, as we will cover in Chapter 8, Visualizing Convolutional Neural Networks, three of them work with images, including KernelExplainer. In fact, SHAP's KernelExplainer was designed to be a general-purpose, truly model-agnostic method, but it's not promoted as an option for NLP. It's easy to understand why: it's slow, and NLP models tend to be very complex, with hundreds, if not thousands, of features to boot. In cases such as this one, where word order is not a factor and you have a few hundred features, but the top 100 are present in most of your observations, KernelExplainer could work.
In addition to the slowness, there are a couple of technical hurdles you would need to overcome. One of them is that KernelExplainer is compatible with a pipeline, but it expects a single set of predictions back. But LightGBM...
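To make the single-set-of-predictions requirement concrete, here is a minimal sketch. It assumes a hypothetical two-step scikit-learn pipeline (TF-IDF vectorizer plus a classifier; the corpus and step names are illustrative, not from the original text). A classifier's predict_proba returns one column per class, so the trick is to wrap it in a function that returns only the positive-class probabilities as a 1-D array, which is the shape a KernelExplainer prediction function should produce:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus for illustration only
corpus = ["good movie", "bad movie", "great film", "terrible film"]
labels = [1, 0, 1, 0]

pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(corpus, labels)

# KernelExplainer perturbs numeric feature matrices, so we explain the
# classifier on the vectorized features rather than on raw strings
vectorizer = pipeline.named_steps["tfidfvectorizer"]
classifier = pipeline.named_steps["logisticregression"]
X = vectorizer.transform(corpus).toarray()

def predict_positive(X):
    # Collapse predict_proba's (n_samples, n_classes) output into a
    # single 1-D array of positive-class probabilities
    return classifier.predict_proba(X)[:, 1]

print(predict_positive(X).shape)  # one prediction per document
```

With a wrapper like this in place, you could then pass it to SHAP as `shap.KernelExplainer(predict_positive, X_background)`, where `X_background` is a background sample of the vectorized features; this is a sketch of the general pattern, not the exact code used later in the chapter.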