Recent work has shown that neural networks are vulnerable to adversarial examples: seemingly imperceptible perturbations of the data can cause the model to misbehave, for example by misclassifying its inputs. Many researchers have proposed adversarial attack and defense mechanisms to address these vulnerabilities. While these works provide an initial foundation for adversarial training, there are no guarantees that proposed white-box attacks find the most adversarial perturbation, nor that the proposed defenses provably prevent any class of attacks. On the other hand, verification of deep networks using SMT (satisfiability modulo theories) solvers provides formal robustness guarantees but is NP-hard in general, making it computationally prohibitive even on small networks.
The authors take the perspective of distributionally robust optimization and provide an adversarial training procedure with provable guarantees on its computational and statistical performance.
This paper proposes a principled methodology for inducing distributional robustness in neural network training, with the goal of mitigating the impact of adversarial examples.
The idea is to train the model to perform well not only with respect to the unknown population distribution, but also on the worst-case distribution within a Wasserstein ball around it. In particular, the authors adopt the Wasserstein distance to define the ambiguity sets. This allows them to use strong duality results from the literature on distributionally robust optimization and to express the empirical minimax problem as regularized ERM (empirical risk minimization) with a modified cost.
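For reference, the duality result alluded to here can be sketched as follows; the notation (loss \ell, transport cost c, penalty parameter \gamma) is chosen for this summary rather than quoted from the paper. The distributionally robust problem is

    \min_\theta \; \sup_{P:\, W_c(P, P_0) \le \rho} \; \mathbb{E}_{Z \sim P}\big[\ell(\theta; Z)\big],

and its penalized (Lagrangian) relaxation, for a fixed \gamma \ge 0, satisfies

    \sup_{P} \Big\{ \mathbb{E}_{Z \sim P}\big[\ell(\theta; Z)\big] - \gamma\, W_c(P, P_0) \Big\} \;=\; \mathbb{E}_{Z \sim P_0}\big[\phi_\gamma(\theta; Z)\big], \qquad \phi_\gamma(\theta; z) := \sup_{z'} \big\{ \ell(\theta; z') - \gamma\, c(z', z) \big\},

so that the worst-case objective reduces to an ERM-style objective under a robust surrogate loss.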
Overall Score: 27/30
Average Score: 9
The reviewers strongly accepted this paper and stated that it is of high quality and originality. One reviewer called the paper an interesting attempt, but noted that some of the key claims seem inaccurate and that comparisons to proper baselines are missing.
Another reviewer said that the paper applies recently developed ideas from the robust optimization literature, in particular distributionally robust optimization with the Wasserstein metric, and shows that, for smooth loss functions and when not too much robustness is requested, the resulting optimization problem is no harder than the original one (in which no adversarial attack is considered).
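For concreteness, the following is a minimal sketch, in PyTorch rather than the authors' own code, of the kind of training loop this tractability result suggests; the function names, the squared-Euclidean transport cost, and the step sizes are illustrative assumptions, not the paper's exact algorithm.

    # Sketch: adversarial training via the robust surrogate loss phi_gamma.
    # Inner loop: gradient ascent on loss(z') - gamma * c(z', x); outer loop: SGD on theta.
    import torch

    def robust_surrogate_point(model, loss_fn, x, y, gamma, ascent_steps=15, ascent_lr=0.1):
        """Approximate argmax_z' { loss(model(z'), y) - gamma * ||z' - x||^2 } by gradient ascent."""
        z = x.clone().detach().requires_grad_(True)
        for _ in range(ascent_steps):
            penalty = gamma * ((z - x) ** 2).sum()        # squared-Euclidean cost (an assumption)
            objective = loss_fn(model(z), y) - penalty
            grad, = torch.autograd.grad(objective, z)
            with torch.no_grad():
                z += ascent_lr * grad                     # ascent step on the perturbed input
        return z.detach()

    def training_step(model, loss_fn, optimizer, x, y, gamma):
        z_adv = robust_surrogate_point(model, loss_fn, x, y, gamma)
        optimizer.zero_grad()
        loss = loss_fn(model(z_adv), y)                   # outer minimization on perturbed data
        loss.backward()
        optimizer.step()
        return loss.item()

The "not too much robustness" caveat corresponds, roughly, to choosing the penalty gamma large enough that the inner maximization is strongly concave for smooth losses, so the ascent steps converge quickly.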
The paper also received some criticisms, but on the whole it was well received by the reviewers.