The GTSRB dataset contains more than 50,000 images of traffic signs belonging to 43 classes.
This dataset was used by professionals in a classification challenge during the International Joint Conference on Neural Networks (IJCNN) in 2011. The GTSRB dataset suits our purposes well because it is large, organized, freely available, and annotated.
Although the actual traffic sign is not necessarily square or centered in each image, the dataset comes with an annotation file that specifies the bounding box of each sign.
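To make the role of the annotation file concrete, here is a minimal sketch of reading bounding boxes and cropping an image to the sign. It assumes the semicolon-separated layout with `Filename`, `Roi.X1`, `Roi.Y1`, `Roi.X2`, `Roi.Y2`, and `ClassId` columns that the dataset is commonly distributed with, and it treats both box corners as inclusive; check both assumptions against the files you actually download. The sample rows, the dummy image, and the helper names `read_annotations` and `crop_to_box` are made up for illustration.

```python
import csv
import io

# Hypothetical excerpt of an annotation file. The column names are an
# assumption about the layout; verify them against the real CSV files.
SAMPLE_ANNOTATIONS = """Filename;Width;Height;Roi.X1;Roi.Y1;Roi.X2;Roi.Y2;ClassId
00000_00000.ppm;29;30;5;6;24;25;0
00000_00001.ppm;30;30;5;5;25;25;0
"""


def read_annotations(text):
    """Parse annotation text into a list of dicts with integer box fields."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    rows = []
    for row in reader:
        rows.append({
            "filename": row["Filename"],
            "box": (int(row["Roi.X1"]), int(row["Roi.Y1"]),
                    int(row["Roi.X2"]), int(row["Roi.Y2"])),
            "class_id": int(row["ClassId"]),
        })
    return rows


def crop_to_box(image, box):
    """Crop a row-major image (a list of pixel rows) to (x1, y1, x2, y2).

    Both corners are treated as inclusive, which is an assumption here.
    """
    x1, y1, x2, y2 = box
    return [row[x1:x2 + 1] for row in image[y1:y2 + 1]]


annotations = read_annotations(SAMPLE_ANNOTATIONS)

# Stand-in for a decoded image: 30 rows of 29 pixels each.
dummy_image = [[0] * 29 for _ in range(30)]

sign = crop_to_box(dummy_image, annotations[0]["box"])
print(len(sign), len(sign[0]))  # height and width of the cropped region
```

With real data, the same slicing applies to an image array loaded by whatever imaging library you use; only the parsing step depends on the annotation layout.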
Before doing any sort of machine learning, it is usually a good idea to get a feel for the dataset, its qualities, and its challenges. Useful steps include manually going through the data to understand its characteristics, reading the data description, if one is available on the page, to understand...