Finding neurons to interpret
With millions or even billions of neurons in today's SoTA architectures, interpreting every single neuron is impossible and, frankly, a waste of time. Which neuron to explain should depend on your goal. The following list shows some common goals and the associated ways of choosing suitable neurons:
- Finding out what a certain prediction label or class pattern looks like: In this case, simply choose a neuron specific to the prediction of the target label or class. This is usually done to check whether the model has captured the desired class patterns or whether it has instead learned irrelevant features. It can also be useful in multilabel scenarios where certain labels always co-occur and you want to decouple them to better understand the input patterns associated with a single label (see the sketch after this list).
- Wanting to understand the latent reasons why a specific label can be predicted in your dataset...
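As a rough illustration of the first goal, the sketch below selects the output neuron tied to a chosen class in a PyTorch classifier and records its activation with a forward hook. The model (an untrained torchvision ResNet-18 as a stand-in), the layer (`model.fc`), and the class index are assumptions made for illustration, not details prescribed by this section.

```python
# Minimal sketch: record the activation of the output neuron for one target class.
# Model, layer, and class index are illustrative assumptions.
import torch
from torchvision import models

model = models.resnet18(weights=None)  # any trained classifier would do
model.eval()

target_class = 207   # hypothetical class index (e.g., an ImageNet label)
recorded = {}

def record_logit(module, inputs, output):
    # output has shape (batch, num_classes); keep only the target-class neuron
    recorded["activation"] = output[:, target_class].detach()

# model.fc is the final linear layer of torchvision's ResNet-18;
# its output units are the per-class prediction neurons
hook = model.fc.register_forward_hook(record_logit)

x = torch.randn(1, 3, 224, 224)  # stand-in input; use real data in practice
with torch.no_grad():
    model(x)
hook.remove()

print("target-class neuron activation:", recorded["activation"].item())
```

The recorded activation (or the neuron itself) can then be fed into whichever attribution or feature-visualization method you use to inspect what drives that class prediction.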