3. Unsupervised learning by maximizing the mutual information between discrete random variables
A classic problem in deep learning is supervised classification. In Chapter 1, Introducing Advanced Deep Learning with Keras, and Chapter 2, Deep Neural Networks, we learned that supervised classification requires labeled input images. We performed classification on both the MNIST and CIFAR10 datasets. For MNIST, a 3-layer CNN followed by a Dense layer can achieve as much as 99.3% accuracy. For CIFAR10, using ResNet or DenseNet, we can achieve accuracy of about 93% to 94%. Both MNIST and CIFAR10 are labeled datasets.
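To make this concrete, the following is a minimal Keras sketch of such a model. The filter counts, pooling, and dropout rate here are illustrative assumptions, not the exact settings used in Chapter 1:

```python
from tensorflow.keras import layers, models

def build_mnist_cnn():
    """A 3-layer CNN followed by a Dense classifier head for MNIST."""
    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),        # MNIST: 28x28 grayscale images
        layers.Conv2D(64, 3, activation='relu'),
        layers.MaxPooling2D(2),
        layers.Conv2D(64, 3, activation='relu'),
        layers.MaxPooling2D(2),
        layers.Conv2D(64, 3, activation='relu'),
        layers.Flatten(),
        layers.Dropout(0.2),                    # assumed regularization choice
        layers.Dense(10, activation='softmax')  # one output per digit class
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```

Trained for a few epochs on the 60,000 labeled MNIST training images, a model of this size reaches accuracy in the vicinity of the figure quoted above.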
In contrast to supervised learning, our objective in this chapter is unsupervised learning, with a focus on classification without labels. The idea is that if we learn how to cluster the latent code vectors of all the training data, then a linear separation algorithm can classify the latent vector of each test input.
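The linear separation step can be sketched as a linear probe: latent vectors from a trained encoder are fed to a single softmax layer. In the sketch below, the `encoder` stand-in, the 32-dimensional latent size, and the random placeholder data are all assumptions for illustration; the unsupervised training of the actual encoder is the subject of the rest of this chapter:

```python
import numpy as np
from tensorflow.keras import layers, models

# Hypothetical stand-in for an encoder trained without labels;
# any model mapping images to latent code vectors would do here.
encoder = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Flatten(),
    layers.Dense(32)                # assumed 32-dim latent code
])

x = np.random.rand(256, 28, 28, 1).astype('float32')  # placeholder images
y = np.random.randint(0, 10, size=256)                # placeholder labels

# Encode the inputs, then fit a linear separator (one softmax layer)
# directly on the latent code vectors.
z = encoder.predict(x, verbose=0)
probe = models.Sequential([
    layers.Input(shape=(32,)),
    layers.Dense(10, activation='softmax')
])
probe.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
probe.fit(z, y, epochs=5, verbose=0)
```

If the clustering of the latent codes is good, even this single linear layer separates the classes well; that is what makes the latent representation useful for classification without labels.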
To learn the clustering of latent code vectors without labels, our training objective is to maximize the mutual information between the input and its latent code vector.
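For reference, the mutual information between two discrete random variables X and Y, on which this objective builds, has the standard definition:

```latex
I(X; Y) = \sum_{y \in Y} \sum_{x \in X} P(x, y) \log \frac{P(x, y)}{P(x)\,P(y)}
```

Intuitively, I(X; Y) measures how much knowing the latent code Y reduces our uncertainty about the input X; maximizing it forces the latent codes to retain class-relevant information about the inputs.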