Interpretability
Interpretability can be defined as the degree to which a human can understand the cause of a decision. In machine learning and artificial intelligence, that translates to the degree to which someone can understand the how and why of an algorithm and its predictions. There are two ways to look at interpretability: transparency and post hoc interpretation.
Transparency is when the model is inherently simple and can be simulated or reasoned about using human cognition alone. A human should be able to fully understand the inputs and the process the model uses to convert those inputs into outputs. This is a very stringent condition that almost no machine learning or deep learning model satisfies.
This is where post hoc interpretation techniques shine. There is a wide variety of techniques that use the inputs and outputs of a model to understand why a model has made the predictions it has.
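To make the idea of working only from inputs and outputs concrete, here is a minimal sketch of one such technique, permutation feature importance, written in plain Python. Everything in it is an assumption made for illustration: black_box_model is a hypothetical stand-in for a trained model, the data is synthetic, and mean squared error is just one possible choice of metric. The procedure shuffles one feature at a time and measures how much the error grows; the model is only ever called, never inspected.

```python
import random


def black_box_model(row):
    # Hypothetical stand-in for a trained model. The technique below
    # only calls it; it never looks at these coefficients.
    x0, x1, x2 = row
    return 3.0 * x0 + 0.5 * x1  # x2 is deliberately ignored


def mean_squared_error(model, X, y):
    return sum((model(r) - t) ** 2 for r, t in zip(X, y)) / len(X)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in error when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the target.
            column = [r[j] for r in X]
            rng.shuffle(column)
            X_perm = [r[:j] + (v,) + r[j + 1:] for r, v in zip(X, column)]
            increases.append(mean_squared_error(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data (an assumption for this sketch): three features drawn
# uniformly from [-1, 1], with targets produced by the model itself.
data_rng = random.Random(42)
X = [tuple(data_rng.uniform(-1, 1) for _ in range(3)) for _ in range(200)]
y = [black_box_model(r) for r in X]

importances = permutation_importance(black_box_model, X, y)
for j, score in enumerate(importances):
    print(f"feature {j}: importance {score:.3f}")
```

In this toy setup the first feature should come out as the most important, the second as much less so, and the third as having no importance at all, since shuffling a feature the model ignores cannot change its predictions.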
There are many popular techniques such as permutation feature...