Black-box attacks with transferability
So far, we have looked at white-box and gray-box attacks, in which attackers have enough knowledge of the target model, including its architecture, to derive a shadow model.
An alternative approach is the black-box attack, which relies on transferability rather than on knowledge of the target. Black-box attacks are particularly insidious because they are executed without knowledge of the target model’s parameters, architecture, or training data. Instead, attackers typically have access only to the model’s inputs and outputs. Their success rests on transferability: adversarial examples crafted for one model (the surrogate) can often also deceive another model (the target).
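To make transferability concrete, the following is a minimal sketch on a toy problem, not a realistic attack. Everything in it is an assumption chosen for illustration: the synthetic 20-feature dataset, the logistic-regression target and surrogate, the perturbation size `EPS`, and the use of a single signed-gradient step (in the style of FGSM) as the crafting method. The surrogate is fitted only on labels obtained by querying the target, and the adversarial examples are computed from the surrogate's gradients alone.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, EPS = 20, 0.5  # illustrative values, not taken from the text

def sample(n):
    """Toy binary task: class 1 is class 0 shifted by +0.5 on every feature."""
    y = rng.integers(0, 2, size=n)
    x = rng.normal(size=(n, DIM)) + 0.5 * y[:, None]
    return x, y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(x, y, steps=500, lr=0.1):
    """Logistic regression fitted by plain gradient descent."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        err = sigmoid(x @ w + b) - y
        w -= lr * (x.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def predict(model, x):
    w, b = model
    return (x @ w + b > 0).astype(int)

def accuracy(model, x, y):
    return float((predict(model, x) == y).mean())

def fgsm(model, x, y, eps):
    """One signed-gradient step that increases the model's cross-entropy loss."""
    w, b = model
    grad_x = (sigmoid(x @ w + b) - y)[:, None] * w[None, :]
    return x + eps * np.sign(grad_x)

# The target stands in for the victim; the attacker never reads its weights.
target = train(*sample(2000))

# The attacker labels their own inputs by querying the target, then fits a surrogate.
x_query, _ = sample(2000)
surrogate = train(x_query, predict(target, x_query))

# Adversarial examples are crafted with the surrogate's gradients only.
x_test, y_test = sample(1000)
x_adv = fgsm(surrogate, x_test, y_test, EPS)
# Baseline: random perturbation with the same per-feature magnitude.
x_noise = x_test + EPS * rng.choice([-1.0, 1.0], size=x_test.shape)

clean_acc = accuracy(target, x_test, y_test)
adv_acc = accuracy(target, x_adv, y_test)
noise_acc = accuracy(target, x_noise, y_test)
print(f"target accuracy: clean={clean_acc:.2f} "
      f"random={noise_acc:.2f} transferred={adv_acc:.2f}")
```

In this toy setting, the target's accuracy should drop far more on the transferred examples than on random perturbations of the same size, which is the effect transferability describes. Real targets and surrogates are rarely this similar, so transfer rates in practice are usually lower and depend on how closely the surrogate approximates the target's decision boundary.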
Attack scenario
The threat model for black-box attacks assumes that attackers have no internal knowledge of the target system. They cannot directly calculate gradients or other model specifics required for crafting adversarial examples. However, they...