CapsNet for classifying Fashion MNIST images
Now let's take a look at the implementation of CapsNet for classifying Fashion MNIST images. Zalando, the e-commerce company, released Fashion MNIST (https://github.com/zalandoresearch/fashion-mnist) as a drop-in replacement for the MNIST dataset. The dataset consists of 28 x 28 grayscale images in 10 categories:
Category name | Label (in dataset) |
T-shirt/top | 0 |
Trouser | 1 |
Pullover | 2 |
Dress | 3 |
Coat | 4 |
Sandal | 5 |
Shirt | 6 |
Sneaker | 7 |
Bag | 8 |
Ankle boot | 9 |
The following are some sample images from the dataset:
The training set contains 60K examples, and the test set contains 10K examples.
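To make the dataset description concrete, the category table and the sizes given above can be written down as plain Python constants. This is only a minimal sketch: the names `FASHION_MNIST_LABELS`, `label_name`, and the size constants are hypothetical helpers introduced for illustration and are not part of any library or of the dataset's own tooling.

```python
# Label-to-category mapping, copied from the category table above.
FASHION_MNIST_LABELS = {
    0: "T-shirt/top",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle boot",
}

# Image and split sizes as stated in the text.
IMAGE_HEIGHT, IMAGE_WIDTH = 28, 28
NUM_TRAIN_EXAMPLES = 60_000
NUM_TEST_EXAMPLES = 10_000


def label_name(label):
    """Return the category name for an integer label in the range 0-9."""
    if label not in FASHION_MNIST_LABELS:
        raise ValueError(f"label must be an integer in 0..9, got {label!r}")
    return FASHION_MNIST_LABELS[label]


if __name__ == "__main__":
    # Each grayscale image holds 28 * 28 pixel values.
    print("pixels per image:", IMAGE_HEIGHT * IMAGE_WIDTH)
    print("label 9 is:", label_name(9))
```

A lookup like this is handy later when turning a predicted class index back into a readable category name.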
CapsNet implementation
The CapsNet architecture consists of two parts of three layers each. The first three layers form the encoder, and the last three form the decoder:
Layer Num | Layer Name | Layer Type |
1 | Convolutional Layer | Encoder |
2 | PrimaryCaps Layer | Encoder |
3 | DigitCaps Layer | Encoder |
4 | Fully Connected Layer 1 | Decoder |
5 | Fully Connected Layer 2 | Decoder |
6 | Fully Connected Layer 3 | Decoder... |
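The six layers in the table above can be traced as a sequence of tensor shapes. The sketch below is an illustration under assumptions, not the implementation described in this section: the kernel sizes, strides, channel counts, capsule dimensions, and decoder widths are taken from the original CapsNet paper (Sabour et al., 2017) and may differ from the values used here. The function names `conv_output_size` and `capsnet_shapes` are hypothetical.

```python
def conv_output_size(input_size, kernel_size, stride):
    """Spatial output size of a convolution without padding."""
    return (input_size - kernel_size) // stride + 1


def capsnet_shapes(image_size=28, num_classes=10):
    """Return (layer name, layer type, output shape) for each of the six layers.

    All hyperparameters below are ASSUMED from the original CapsNet paper,
    not taken from this text.
    """
    # Layer 1: 9x9 convolution, stride 1, 256 channels (assumed).
    conv = conv_output_size(image_size, kernel_size=9, stride=1)
    # Layer 2: 9x9 convolution, stride 2, grouped into 32 capsule types
    # of 8 dimensions each (assumed).
    primary = conv_output_size(conv, kernel_size=9, stride=2)
    num_primary_capsules = primary * primary * 32
    # Layer 3: one 16-dimensional capsule per class (assumed).
    digit_caps_dim = 16
    # Layers 4-6: fully connected decoder that reconstructs the flattened
    # image (widths 512 and 1024 assumed).
    return [
        ("Convolutional Layer", "Encoder", (conv, conv, 256)),
        ("PrimaryCaps Layer", "Encoder", (num_primary_capsules, 8)),
        ("DigitCaps Layer", "Encoder", (num_classes, digit_caps_dim)),
        ("Fully Connected Layer 1", "Decoder", (512,)),
        ("Fully Connected Layer 2", "Decoder", (1024,)),
        ("Fully Connected Layer 3", "Decoder", (image_size * image_size,)),
    ]


if __name__ == "__main__":
    for name, kind, shape in capsnet_shapes():
        print(f"{name:<26}{kind:<10}{shape}")
```

Under these assumed settings, the final decoder layer has as many units as there are pixels in the input image, which is what lets the decoder output be compared against the original image.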