In this chapter, we looked at CapsNet, a popular neural network architecture from Geoff Hinton (often called the father of deep learning).
We started off by understanding the limitations of CNNs in their current form. They rely on max pooling as a crutch to achieve invariance in the activities of their neurons. Max pooling tends to lose information, in particular the exact position of a feature within each pooling window, so it can't model the spatial relationships between different objects in the image. We then touched upon how the human brain detects objects in a viewpoint-invariant way. We drew an analogy to computer graphics and saw how pose information could be incorporated into neural networks.
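To make the point about information loss concrete, here is a minimal sketch in plain Python. The `max_pool_2x2` helper and the two feature maps are hypothetical illustrations, not code from this chapter. They assume non-overlapping 2x2 windows on a small grid of activations.

```python
def max_pool_2x2(image):
    """Apply non-overlapping 2x2 max pooling to a 2D list with even dimensions."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Hypothetical feature map A: four activations pushed out to the corners.
feature_map_a = [
    [9, 0, 0, 7],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [5, 0, 0, 3],
]

# Hypothetical feature map B: the same four activations clustered in the center.
feature_map_b = [
    [0, 0, 0, 0],
    [0, 9, 7, 0],
    [0, 5, 3, 0],
    [0, 0, 0, 0],
]

pooled_a = max_pool_2x2(feature_map_a)
pooled_b = max_pool_2x2(feature_map_b)

# Both inputs pool to the same 2x2 grid, so the layout difference is gone.
print(pooled_a)              # [[9, 7], [5, 3]]
print(pooled_a == pooled_b)  # True
```

The two inputs place their features in clearly different arrangements, yet after pooling they are indistinguishable. Any layer that sees only the pooled output cannot tell whether the parts were spread apart or packed together.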
Subsequently, we learned about capsules, the basic building blocks of capsule networks. We saw how they differ from traditional neurons in that they take a vector as input and produce a vector as output. We also learned about a special...