Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations

9 February 2022

Lei Hsiung

Tsung-Yi Ho

Abstract

Model robustness against adversarial examples of a single perturbation type, such as the $\ell_{p}$-norm, has been widely studied, yet its generalization to more realistic scenarios involving multiple semantic perturbations and their composition remains largely unexplored. In this paper, we first propose a novel method for generating composite adversarial examples. Our method can find the optimal attack composition by utilizing component-wise projected gradient descent and automatic attack-order scheduling. We then propose generalized adversarial training (GAT) to extend model robustness from the $\ell_{p}$-ball to composite semantic perturbations, such as the combination of Hue, Saturation, Brightness, Contrast, and Rotation. Results obtained using the ImageNet and CIFAR-10 datasets indicate that GAT can be robust not only to each of the tested single attack types, but also to any combination of such attacks. GAT also outperforms baseline $\ell_{\infty}$-norm bounded adversarial training approaches by a significant margin.
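To make the idea of a composite attack concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy linear classifier with logistic loss, two hypothetical one-parameter perturbations (`brightness`, `contrast`) with made-up bounds, finite-difference gradients in place of backpropagation, and a brute-force search over orderings as a stand-in for the attack-order scheduling mentioned in the abstract. All function names and constants are illustrative assumptions.

```python
import itertools
import numpy as np

# Hypothetical semantic perturbations, each driven by one scalar parameter.
def brightness(x, b):
    return np.clip(x + b, 0.0, 1.0)

def contrast(x, c):
    return np.clip(x * c, 0.0, 1.0)

# name -> (function, (lower bound, upper bound), identity value); bounds are assumed.
ATTACKS = {
    "brightness": (brightness, (-0.2, 0.2), 0.0),
    "contrast": (contrast, (0.7, 1.3), 1.0),
}

def compose(x, params, order):
    """Apply the perturbations to x one after another, in the given order."""
    for name in order:
        x = ATTACKS[name][0](x, params[name])
    return x

def toy_loss(x, w, y):
    """Logistic loss of a toy linear classifier; y is -1 or +1."""
    margin = y * float(np.sum(w * x))
    return float(np.log1p(np.exp(-margin)))

def composite_attack(x, w, y, order, steps=10, fd_eps=1e-3):
    """Component-wise projected gradient ascent on the perturbation parameters."""
    params = {name: ATTACKS[name][2] for name in order}
    for _ in range(steps):
        for name in order:
            lo, hi = ATTACKS[name][1]
            plus = dict(params, **{name: params[name] + fd_eps})
            minus = dict(params, **{name: params[name] - fd_eps})
            # Central finite difference of the loss w.r.t. this one parameter.
            grad = (toy_loss(compose(x, plus, order), w, y)
                    - toy_loss(compose(x, minus, order), w, y)) / (2 * fd_eps)
            step = (hi - lo) / steps
            # Signed ascent step, then projection back onto [lo, hi].
            params[name] = float(np.clip(params[name] + step * np.sign(grad), lo, hi))
    return params, compose(x, params, order)

def best_order(x, w, y):
    """Brute-force stand-in for order scheduling: keep the order with highest loss."""
    results = []
    for order in itertools.permutations(ATTACKS):
        params, x_adv = composite_attack(x, w, y, order)
        results.append((toy_loss(x_adv, w, y), order, params))
    return max(results, key=lambda r: r[0])
```

Calling `best_order(x, w, y)` on an image array `x` with values in [0, 1] returns the highest loss found, the ordering that produced it, and the corresponding parameters. In an adversarial training loop, the perturbed input from such an attack would replace the clean input at each training step; exhaustive search over orderings only scales to a handful of perturbation types.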
