Shielding against any evasion attack by adversarial training of a robust classifier
In Chapter 7, Visualizing Convolutional Neural Networks, we identified a garbage image classifier that would likely perform poorly in its intended environment, a municipal recycling plant. Its abysmal performance on out-of-sample data stemmed from the classifier being trained on a large variety of publicly available images that match neither the expected conditions nor the characteristics of the materials processed by a recycling plant. The chapter concluded that training the network with images representative of its intended environment would yield a more robust model.
For model robustness, training data variety is critical, but only if it represents the intended environment. In statistical terms, it's a question of training on samples that accurately depict the population so that the model learns to classify them correctly. For adversarial robustness, the same logic applies: the training data must also represent the adversarial examples the model is likely to face, which is precisely what adversarial training provides.
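To make the idea concrete, the following is a minimal sketch of a single adversarial-training step, assuming a PyTorch image classifier with inputs scaled to [0, 1]; the model, data, and hyperparameters are placeholders rather than the garbage classifier or training pipeline used in this book, and the attack shown (FGSM) is just one simple way to generate the adversarial examples mixed into training:

```python
# A minimal sketch of one adversarial-training step using FGSM-perturbed inputs.
# The model, optimizer, batch, and epsilon below are illustrative placeholders.
import torch
import torch.nn.functional as F


def fgsm_perturb(model, images, labels, epsilon):
    """Craft FGSM adversarial examples with a single signed-gradient step."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # Nudge each pixel in the direction that increases the loss, then clamp
    # back to the valid [0, 1] image range.
    adv = images + epsilon * images.grad.sign()
    return adv.clamp(0.0, 1.0).detach()


def adversarial_training_step(model, optimizer, images, labels, epsilon=0.03):
    """Train on a 50/50 mix of clean and adversarially perturbed examples."""
    model.train()
    adv_images = fgsm_perturb(model, images, labels, epsilon)
    optimizer.zero_grad()  # discard gradients accumulated while crafting attacks
    clean_loss = F.cross_entropy(model(images), labels)
    adv_loss = F.cross_entropy(model(adv_images), labels)
    loss = 0.5 * clean_loss + 0.5 * adv_loss
    loss.backward()
    optimizer.step()
    return loss.item()
```

Mixing clean and perturbed batches, as in the sketch, is one common compromise: training only on adversarial examples tends to harden the model against the attack at the cost of accuracy on unperturbed inputs.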