Fundamentals of evasion attacks
Evasion attacks in adversarial AI are techniques designed to deliberately mislead Machine Learning (ML) models. They occur during the inference stage, when a trained model is used to make predictions. Adversaries craft these attacks by introducing subtle, often imperceptible perturbations to the input data, aiming to cause the model to err. To do so, they target the deployed model and its inference API, for instance, the ImRecS app we used in previous chapters.
In image classification, where evasion attacks are most commonly demonstrated, an attack might involve adding noise to an image that is invisible to the human eye but causes the model to misclassify it. For example, what is clearly an image of a panda to a human observer might be classified as a gibbon after adversarial noise is applied. These perturbations are often optimized by algorithms designed to probe the model’s weaknesses, exploiting gradients (in gradient-based...
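To make the idea concrete, here is a minimal sketch of a gradient-based evasion attack in the spirit of the Fast Gradient Sign Method (FGSM), assuming a pretrained PyTorch classifier rather than the ImRecS service itself. The image file name, the epsilon value, and the helper function are illustrative assumptions, not part of the original example.

```python
# A minimal FGSM-style sketch, assuming PyTorch and torchvision are installed.
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

def fgsm_perturb(model, image, label, epsilon=0.007):
    """Return a copy of `image` nudged along the sign of the loss gradient.

    This is the core of a gradient-based evasion attack: a small step per
    pixel that increases the model's loss for the true label.
    """
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adv = image + epsilon * image.grad.sign()      # step in the gradient-sign direction
    return torch.clamp(adv, 0, 1).detach()         # keep pixels in the valid [0, 1] range

# Usage sketch: attack a standard ImageNet classifier on a single image.
# Normalization is folded into the model so the attack works on raw pixels.
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
classifier = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model = torch.nn.Sequential(normalize, classifier).eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
x = preprocess(Image.open("panda.jpg")).unsqueeze(0)   # hypothetical input image
y = torch.tensor([388])                                # ImageNet class id for "giant panda"

x_adv = fgsm_perturb(model, x, y)
print(model(x).argmax(dim=1), model(x_adv).argmax(dim=1))  # prediction often flips
```

The perturbation budget `epsilon` controls the trade-off: larger values flip predictions more reliably but become visible to a human observer, while small values like the one above typically remain imperceptible.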