The GTSRB dataset
To apply our classifier to traffic sign recognition, we need a suitable dataset. A good choice is the German Traffic Sign Recognition Benchmark (GTSRB), which contains more than 50,000 images of traffic signs belonging to more than 40 classes. It is a challenging dataset that served as the basis of a classification competition held at the International Joint Conference on Neural Networks (IJCNN) 2011. The dataset can be freely obtained from http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset.
The GTSRB dataset is well suited to our purposes because it is large, well organized, freely available, and annotated. However, for the purpose of this book, we will limit the classification to data samples from a total of 10 classes.
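Restricting the data to a subset of classes can be sketched as follows. This is a minimal illustration, not the book's own loading code: the function name select_classes and the constant SELECTED_CLASSES are hypothetical, and because the text does not say which 10 classes are used, the first ten class IDs serve only as a placeholder.

```python
import numpy as np

# Placeholder choice (assumption): the text does not name the ten classes,
# so the first ten class IDs are used here purely for illustration.
SELECTED_CLASSES = tuple(range(10))


def select_classes(samples, labels, selected=SELECTED_CLASSES):
    """Keep samples whose label is in `selected`.

    Labels are remapped to 0..len(selected)-1 in the order given by
    `selected`, so that a classifier sees contiguous class indices.
    """
    labels = np.asarray(labels)
    mask = np.isin(labels, selected)
    remap = {class_id: index for index, class_id in enumerate(selected)}
    kept_samples = [s for s, keep in zip(samples, mask) if keep]
    kept_labels = np.array([remap[int(c)] for c in labels[mask]], dtype=int)
    return kept_samples, kept_labels


# Usage with stand-in data: strings take the place of image arrays.
samples = ['img_a', 'img_b', 'img_c', 'img_d']
labels = [0, 12, 9, 42]
kept, new_labels = select_classes(samples, labels)
```

Remapping the labels matters mainly when the chosen class IDs are not already contiguous, for example when selecting classes 12 and 42 only.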
Although the traffic sign is not necessarily square or centered in each image, the dataset comes with an annotation file that specifies the bounding box of each sign.
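To make the role of the bounding boxes concrete, here is a small sketch that parses annotation rows and crops an image to the annotated region. It assumes the annotations are semicolon-separated CSV with the columns Filename, Roi.X1, Roi.Y1, Roi.X2, Roi.Y2, and ClassId; check this against the files you download. The helper names read_annotations and crop_sign are hypothetical, and a blank NumPy array stands in for a real image.

```python
import csv
import io

import numpy as np


def read_annotations(csv_text):
    """Parse semicolon-separated annotation rows into a list of dicts.

    The column names used here are an assumption about the file layout.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=';')
    rows = []
    for row in reader:
        rows.append({
            'filename': row['Filename'],
            'box': (int(row['Roi.X1']), int(row['Roi.Y1']),
                    int(row['Roi.X2']), int(row['Roi.Y2'])),
            'class_id': int(row['ClassId']),
        })
    return rows


def crop_sign(image, box):
    """Return the region of `image` (rows, cols, channels) inside `box`.

    The box is treated as half-open, (x1, y1) inclusive and (x2, y2)
    exclusive; this convention is an assumption.
    """
    x1, y1, x2, y2 = box
    return image[y1:y2, x1:x2]


# Stand-in data: one annotation row and a blank 30x30 color image.
sample_csv = (
    "Filename;Width;Height;Roi.X1;Roi.Y1;Roi.X2;Roi.Y2;ClassId\n"
    "00000_00000.ppm;30;30;5;6;25;24;1\n"
)
annotations = read_annotations(sample_csv)
image = np.zeros((30, 30, 3), dtype=np.uint8)
sign = crop_sign(image, annotations[0]['box'])
```

Note that NumPy indexes rows (y) before columns (x), which is why the slice order is the reverse of the order in which the coordinates appear in the box.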
A good idea before doing any sort of machine learning...