Stacking Restricted Boltzmann Machines to generate images: the Deep Belief Network
You have seen that an RBM with a single hidden layer can be used to learn a generative model of images; in fact, theoretical work has shown that, with a sufficiently large number of hidden units, an RBM can approximate any distribution over binary vectors.19 However, in practice, for high-dimensional input data it is often more efficient to add additional layers rather than use a single very large hidden layer, yielding a more "compact" representation of the data.
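To make the single-RBM case concrete, here is a minimal NumPy sketch of a binary-binary RBM trained with one step of contrastive divergence (CD-1). It is illustrative only: the class name, learning rate, and weight initialization are assumptions for the sketch, not the chapter's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Minimal binary-binary RBM trained with CD-1 (illustrative sketch)."""

    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)  # visible biases
        self.b_h = np.zeros(n_hidden)   # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        # p(h=1 | v)
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        # p(v=1 | h)
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_update(self, v0):
        # Positive phase: hidden activations driven by the data
        p_h0 = self.hidden_probs(v0)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        # Negative phase: one reconstruction step from the sampled hiddens
        p_v1 = self.visible_probs(h0)
        p_h1 = self.hidden_probs(p_v1)
        # CD-1 gradient approximation: <v h>_data - <v h>_reconstruction
        batch = v0.shape[0]
        self.W += self.lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / batch
        self.b_v += self.lr * (v0 - p_v1).mean(axis=0)
        self.b_h += self.lr * (p_h0 - p_h1).mean(axis=0)
```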
The researchers who developed DBNs also showed that adding additional layers can only improve the variational lower bound on the log-likelihood of the data under the generative model.20 In this approach, the hidden layer output h of the first RBM becomes the input to a second RBM, and we can keep adding layers in the same way to build a deeper network (see the stacking sketch below). Furthermore, if we wanted to make this network capable of learning not only the distribution of the...
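As a hedged illustration of this greedy layer-wise stacking, the sketch below pretrains a list of RBMs (reusing the RBM class from the sketch above), feeding each trained layer's hidden probabilities forward as the "data" for the next layer. The function name pretrain_dbn, the layer sizes, the epoch count, and the random stand-in images are all hypothetical choices for the example.

```python
def pretrain_dbn(data, layer_sizes, epochs=5):
    """Greedily pretrain a stack of RBMs, one layer at a time (sketch)."""
    rbms, inputs = [], data
    n_visible = data.shape[1]
    for n_hidden in layer_sizes:
        rbm = RBM(n_visible, n_hidden)
        # Train this layer on the activations produced by the layer below
        for _ in range(epochs):
            rbm.cd1_update(inputs)
        # This layer's hidden probabilities become the next layer's input
        inputs = rbm.hidden_probs(inputs)
        n_visible = n_hidden
        rbms.append(rbm)
    return rbms

# Usage: stack three RBMs on binarized 28x28 "images" (random stand-in data)
images = (rng.random((64, 784)) < 0.5).astype(float)
dbn = pretrain_dbn(images, layer_sizes=[256, 128, 64])
```

Note the design choice here: each RBM is trained to completion before the next is created, which is exactly the greedy procedure that guarantees the lower bound on the data log-likelihood cannot decrease as layers are added.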