Let's start to peek at the data. Open up some of the sample files and be prepared for a shocker—these are neither 28x28 tiles of handwriting nor 64x64 icons with cat faces. This is a real dataset from the real world. In fact, not even the sizes are consistent across images. Welcome to the real world.
You'll find sizes ranging from 2,000 pixels per side to almost 5,000 pixels! This brings us to our first real-life task—creating a training pipeline. The pipeline will be a set of steps that abstract away the ugly realities of life and produce a set of clean and consistent data.