Enhancing explainability via a classifier-invariant approach
In this recipe and the next, we will explore techniques for understanding the decisions made by text classifiers. Using a sentiment classifier together with NLP explainability libraries, we will interpret the predicted labels and relate them back to the input text, in particular to the individual words it contains.
Although most current text classification models in NLP are based on deep neural networks, it is difficult to interpret their predictions from the network weights or parameters alone, and equally challenging to map those parameters back to the individual words in the input. Nevertheless, several techniques in the NLP space can help us understand a classifier's decisions; we will explore them in this recipe and the following one.
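The key idea behind a classifier-invariant (model-agnostic) approach is that we never inspect the model's internals: we only query it as a black box and observe how its output changes when the input is perturbed. Below is a minimal sketch of that idea, assuming a toy lexicon-based sentiment scorer as a stand-in for a real classifier; the `predict_proba` function and word lists are purely illustrative, and the explainability libraries used later automate a more sophisticated version of this perturbation strategy.

```python
import math

# Toy black-box sentiment model (a stand-in for any real classifier).
# The explanation code below only calls it through predict_proba, so the
# same procedure works unchanged for any model that scores text.
def predict_proba(text: str) -> float:
    """Return the (toy) probability that `text` is positive."""
    positive = {"great", "love", "excellent"}
    negative = {"bad", "boring", "awful"}
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return 1 / (1 + math.exp(-score))  # squash the score into (0, 1)

def word_importance(text: str) -> list[tuple[str, float]]:
    """Importance of each word = drop in P(positive) when it is removed."""
    words = text.split()
    base = predict_proba(text)
    scores = []
    for i, w in enumerate(words):
        perturbed = " ".join(words[:i] + words[i + 1:])
        scores.append((w, base - predict_proba(perturbed)))
    # Most influential words (largest absolute change) first
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)

for word, weight in word_importance("the plot was great but the acting was awful"):
    print(f"{word:8s} {weight:+.3f}")
```

Running the sketch ranks `great` (positive weight) and `awful` (negative weight) above the neutral words, whose removal leaves the prediction unchanged. Because nothing here depends on how the classifier is built, the same recipe applies to a deep neural network just as well as to this toy scorer.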
In this recipe, we will learn how to interpret the feature importance of each word in a text...