1. Mutual Information
Mutual Information (MI) is a measure of the dependency between two random variables, X and Y. MI is often described as the amount of information gained about X by observing Y; for this reason it is also known as information gain, or the reduction in the uncertainty of X upon observing Y.
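For discrete variables, this reduction-in-uncertainty view corresponds to the standard definition, sketched here for reference (H denotes entropy and p the joint and marginal distributions; the notation is not introduced in the text above):

```latex
I(X;Y) = H(X) - H(X \mid Y)
       = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}
```

MI is zero exactly when X and Y are independent, i.e. when p(x,y) = p(x)p(y) for all x and y.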
In contrast to correlation, MI can measure non-linear statistical dependence between X and Y. This makes MI well suited to deep learning, where most real-world data is unstructured and the dependency between input and output is generally non-linear. In deep learning, the end goal is to use input data and a pre-trained model to perform specific tasks such as classification, translation, regression, or detection. These are also known as downstream tasks.
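The contrast with correlation can be seen in a minimal sketch: for Y = X², Pearson correlation is near zero (the relationship is symmetric), yet MI is clearly positive. The histogram-based plug-in estimator below is an illustrative assumption, not a method from the text; in practice, binned MI estimates are sensitive to the bin count.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100_000)
y = x ** 2  # deterministic but non-linear dependence on x

# Pearson correlation is ~0: the quadratic relationship is symmetric about 0
corr = np.corrcoef(x, y)[0, 1]

def mutual_information(a, b, bins=20):
    """Plug-in MI estimate (in nats) from a 2-D histogram of the samples."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()                 # empirical joint distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal over b
    py = pxy.sum(axis=0, keepdims=True)       # marginal over a
    nz = pxy > 0                              # avoid log(0) on empty cells
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

mi = mutual_information(x, y)
# corr is close to 0 while mi is well above 0, exposing the dependence
# that correlation misses
```

The bin count (20 here) trades off bias and variance of the estimate; neural MI estimators used in deep learning avoid explicit binning altogether.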
Since MI can uncover important aspects of the dependencies among inputs, intermediate features, representations, and outputs, which are random variables themselves, this shared information generally improves the performance of models in downstream tasks.