Before we dive into the code, remember that most machine learning efforts pursue one of two simple goals: classification or ranking. In many cases, the classification is itself a ranking, because we end up choosing the class with the highest rank (often a probability). Our foray into medical imaging will be no different: we will classify images into one of these two binary categories:
- Disease state/positive
- Normal state/negative
Alternatively, we will classify them into multiple classes or rank them. In the case of diabetic retinopathy, we'll rank them as follows:
- Class 0: No Diabetic Retinopathy
- Class 1: Mild
- Class 2: Moderate
- Class 3: Severe
- Class 4: Proliferative Diabetic Retinopathy
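To make the idea that classification is itself a ranking concrete, here is a minimal sketch in plain Python. It assumes a model has already produced one raw score per class (0 through 4) for a single image. The scores below are made up for illustration, and the `softmax` and `rank_classes` helpers are hypothetical names, not part of any library or of the code that follows in this chapter.

```python
import math


def softmax(scores):
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_classes(scores):
    """Return (class_index, probability) pairs, most probable first."""
    probs = softmax(scores)
    return sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)


# Hypothetical raw scores for one image, one per class 0..4.
example_scores = [0.2, 1.1, 2.5, 0.7, -0.3]

ranking = rank_classes(example_scores)
predicted_class = ranking[0][0]  # the top-ranked class is the classification

print(predicted_class)  # 2
```

Taking the first entry of the ranking is the same as taking the arg max of the probabilities, which is why the two goals so often collapse into one.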
Assigning a grade in this way is often called scoring. Kaggle kindly provides participants with over 32 GB of training data, which includes over 35,000 images. The test data is even...