Summary
In this chapter, we introduced the concept of adversarial performance analysis for machine learning models. Adversarial attacks aim to deceive models by feeding them intentionally misleading or carefully crafted inputs that cause incorrect predictions. The chapter highlighted the importance of analyzing adversarial performance to identify potential vulnerabilities and weaknesses in machine learning models and to develop targeted mitigation methods. Adversarial attacks can target various aspects of a model's behavior, including its bias and fairness characteristics as well as its accuracy-based performance. For instance, facial recognition systems may be targeted by adversaries who exploit biases or discrimination present in the training data or in the model's design.
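To illustrate how such carefully crafted inputs can be generated, the following is a minimal sketch of a gradient-based attack in the style of the fast gradient sign method (FGSM); the model, the epsilon value, and the helper name fgsm_attack are illustrative assumptions rather than code from this chapter.

```python
# A hedged sketch, assuming a PyTorch image classifier that accepts a
# batched tensor in [0, 1]; none of these names come from the chapter.
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.03):
    """Perturb `image` so the model is more likely to mispredict `label`."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that increases the loss, then keep pixels valid.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```

Comparing the model's predictions on the original and perturbed inputs gives a simple, repeatable way to quantify how much a small, targeted perturbation degrades accuracy.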
We also explored practical examples and techniques for analyzing adversarial performance in models based on image, text, and audio data. For image-based models, we considered approaches that vary factors such as object size, orientation, blurriness...