Much like RBMs, autoencoders are a class of unsupervised learning algorithms that aim to uncover the hidden structures within data. In principal component analysis (PCA), we try to capture the linear relationships among input variables, and try to represent the data in a reduced dimension space by taking linear combinations (of the input variables) that account for most of the variance in data. However, PCA would not be able to capture the nonlinear relationships between the input variables.
Autoencoders are neural networks that can capture the nonlinear interactions between input variables while representing the input in different dimensions in a hidden layer. Most of the time, the dimensions of the hidden layer are smaller to those of the input. This we skipped, with the assumption that there is an inherent low-dimensional structure to the high-dimensional data. For instance, high-dimensional images can be represented by a low-dimensional manifold, and autoencoders are often used to discover that structure. The following diagram illustrates the neural architecture of an autoencoder:
An autoencoder has two parts: an encoder and a decoder. The encoder tries to project the input data, x, into a hidden layer, h. The decoder tries to reconstruct the input from the hidden layer h. The weights accompanying such a network are trained by minimizing the reconstruction error that is, the error between the reconstructed input, , from the decoder and the original input. If the input is continuous, then the sum of squares of the reconstruction error is minimized, in order to learn the weights of the autoencoder.
If we represent the encoder by a function, fW (x), and the decoder by fU (x), where W and U are the weight matrices associated with the encoder and the decoder, then the following is the case:
(1)
(2)
The reconstruction error, C, over the training set, xi, i = 1, 2, 3, ...m, can be expressed as follows:
(3)
The autoencoder optimal weights, , can be learned by minimizing the cost function from (3), as follows:
(4)
Autoencoders are used for a variety of purposes, such as learning the latent representation of data, noise reduction, and feature detection. Noise reduction autoencoders take the noisy version of the actual input as their input. They try to construct the actual input that acts as a label for the reconstruction. Similarly, autoencoders can be used as generative models. One such class of autoencoders that can work as generative models is called variational autoencoders. Currently, variational autoencoders and GANs are very popular as generative models for image processing.