The autoencoders covered so far (except for CAEs) consisted of only a single-layer encoder and a single-layer decoder. However, both the encoder and the decoder networks can have multiple layers, and deeper networks allow the autoencoder to represent more complex features. The resulting structure is called a stacked autoencoder (also known as a deep autoencoder): the features extracted by one encoder are passed on to the next encoder as its input. A stacked autoencoder can be trained in two ways. It can be trained as a whole network, with the aim of minimizing the reconstruction error between the input and the final output. Alternatively, each individual encoder/decoder pair can first be pretrained using the unsupervised method you learned earlier, after which the complete network is fine-tuned. This pretraining, also called greedy layer-wise training, has been reported to give better results.
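To make the greedy layer-wise idea concrete, here is a minimal sketch in plain NumPy rather than a deep learning framework. Everything specific in it is an assumption made for illustration: the toy data, the 8-6-3 layer sizes, the sigmoid encoder with a linear decoder, the learning rate, and the helper name `pretrain_layer`. Each encoder/decoder pair is trained on its own to reconstruct its input, the learned features are passed on to the next pair, and the pretrained pieces are then stacked into one network. The final fine-tuning of the whole stack is left out for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_layer(inputs, n_hidden, epochs=500, lr=0.5):
    """Train one single-layer autoencoder (sigmoid encoder, linear decoder)
    by full-batch gradient descent on the mean squared reconstruction error.
    Hypothetical helper, written for this illustration only."""
    n_in = inputs.shape[1]
    W_enc = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(0.0, 0.1, (n_hidden, n_in))
    b_dec = np.zeros(n_in)
    losses = []
    for _ in range(epochs):
        hidden = sigmoid(inputs @ W_enc + b_enc)        # encode
        error = hidden @ W_dec + b_dec - inputs         # decode and compare
        losses.append(float(np.mean(error ** 2)))
        # Gradients of the mean squared error, computed before any update.
        grad_out = 2.0 * error / error.size
        grad_pre = (grad_out @ W_dec.T) * hidden * (1.0 - hidden)
        W_dec -= lr * (hidden.T @ grad_out)
        b_dec -= lr * grad_out.sum(axis=0)
        W_enc -= lr * (inputs.T @ grad_pre)
        b_enc -= lr * grad_pre.sum(axis=0)
    return (W_enc, b_enc), (W_dec, b_dec), losses

# Toy data (an assumption): 200 samples with 8 features driven by 2 latent factors.
latent = rng.normal(size=(200, 2))
X = sigmoid(latent @ rng.normal(size=(2, 8)))

layer_sizes = [6, 3]  # hidden sizes of the two stacked encoders (assumed)
encoders, decoders, loss_histories = [], [], []
features = X
for n_hidden in layer_sizes:
    enc, dec, losses = pretrain_layer(features, n_hidden)
    encoders.append(enc)
    decoders.append(dec)
    loss_histories.append(losses)
    # The features extracted by this encoder are the input to the next one.
    features = sigmoid(features @ enc[0] + enc[1])

# Stack the pretrained pieces: encoders in order, then decoders in reverse.
reconstruction = features
for W_dec, b_dec in reversed(decoders):
    reconstruction = reconstruction @ W_dec + b_dec
stacked_error = float(np.mean((reconstruction - X) ** 2))

for i, losses in enumerate(loss_histories, start=1):
    print(f"layer {i}: loss {losses[0]:.4f} -> {losses[-1]:.4f}")
print(f"stacked reconstruction error before fine-tuning: {stacked_error:.4f}")
```

In this sketch, `features` ends up holding the 3-dimensional code produced by the deepest encoder, and `stacked_error` is the end-to-end reconstruction error of the stacked network before any fine-tuning. A fine-tuning step would continue gradient descent on that end-to-end error, updating all encoder and decoder weights together.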