Identifying handwritten mathematical symbols with CNNs
This section deals with building a convolutional neural network (CNN) to identify handwritten mathematical symbols. We're going to use the HASYv2 dataset, which contains about 168,000 images from 369 classes, where each class represents a different symbol. It is a more complex analog of the popular MNIST dataset, which contains handwritten digits.
The following diagram depicts the kind of images that are available in this dataset:
![](https://static.packt-cdn.com/products/9781789539462/graphics/a541bd0c-af08-43a2-a879-a68c2e947ccd.png)
The following graph shows how many symbols have a given number of images:
![](https://static.packt-cdn.com/products/9781789539462/graphics/761fb11c-5024-4192-9a33-dba20953e2e4.png)
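A distribution like the one in the graph can be derived from the dataset's label file. The following is a minimal sketch rather than the code behind the figure. It assumes the labels come as CSV rows with a `latex` column naming each image's symbol; that column name should be checked against your copy of the dataset. The helper names are hypothetical and the three sample rows are made up for illustration.

```python
import csv
from collections import Counter


def read_label_rows(csv_path):
    """Yield one dict per image from a label CSV file that has a header row."""
    with open(csv_path, newline="") as f:
        yield from csv.DictReader(f)


def images_per_symbol(rows, column="latex"):
    """Count how many images each symbol has (the column name is an assumption)."""
    return Counter(row[column] for row in rows)


def symbols_per_image_count(per_symbol):
    """Count how many symbols share each image count, as in the graph."""
    return Counter(per_symbol.values())


# Made-up rows standing in for the real label file.
sample_rows = [
    {"path": "hasy-data/v2-00000.png", "latex": "A"},
    {"path": "hasy-data/v2-00001.png", "latex": "A"},
    {"path": "hasy-data/v2-00002.png", "latex": "\\pi"},
]

per_symbol = images_per_symbol(sample_rows)
histogram = symbols_per_image_count(per_symbol)
print(per_symbol.most_common())   # symbols ordered from most to fewest images
print(sorted(histogram.items()))  # (images per symbol, number of symbols) pairs
```

With the real dataset, passing `read_label_rows(...)` the path of the label file in place of `sample_rows` would give the per-symbol counts, assuming the file has the column described above.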
Many symbols have only a few images, while a few symbols have a great many. The code to import any image is as follows:
![](https://static.packt-cdn.com/products/9781789539462/graphics/680a92a3-ce7a-4ec5-9364-4e71c3638f4d.png)
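In text form, displaying a single image might look like the following sketch. It is not a transcription of the listing above: the `HASYv2/hasy-data` directory and the `v2-00010.png` style of file name are assumptions about how the archive was extracted, and `sample_path` and `show_sample` are hypothetical helpers.

```python
from pathlib import Path

# Assumption: the HASYv2 archive was extracted into ./HASYv2.
DATA_DIR = Path("HASYv2") / "hasy-data"


def sample_path(index):
    """Build the path of one image, assuming names like v2-00010.png."""
    return DATA_DIR / f"v2-{index:05d}.png"


def show_sample(index):
    """Return an IPython Image object for one dataset image."""
    # Imported here so that sample_path() also works outside a notebook.
    from IPython.display import Image

    return Image(filename=str(sample_path(index)))


print(sample_path(10))
```

In a notebook cell, `show_sample(10)` as the final expression would render the image inline, provided the file exists at that path.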
We begin by importing the Image class from the IPython library. This allows us to show images inside a Jupyter Notebook. Here's one image from the dataset:
![](https://static.packt-cdn.com/products/9781789539462/graphics/a172218a-3bd7-41e0-af55-9de3ac9dada8.png)
This is an image of the letter A. Each image is 32 x 32 pixels. This image is in the RGB format even though it doesn't really need to be RGB. The...