
Gradient Methods Provably Converge to Non-Robust Networks

Neural Information Processing Systems (NeurIPS), 2022

9 February 2022

Gal Vardi

Gilad Yehudai

Ohad Shamir


Main: 12 pages

3 figures

Bibliography: 3 pages

Appendix: 11 pages

Abstract

Despite a great deal of research, it is still unclear why neural networks are so susceptible to adversarial examples. In this work, we identify natural settings where depth-$2$ ReLU networks trained with gradient flow are provably non-robust (susceptible to small adversarial $\ell_2$-perturbations), even when robust networks that classify the training dataset correctly exist. Perhaps surprisingly, we show that the well-known implicit bias towards margin maximization induces bias towards non-robust networks, by proving that every network which satisfies the KKT conditions of the max-margin problem is non-robust.
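
For orientation, here is a sketch of the max-margin problem and its KKT conditions as they are commonly stated in the implicit-bias literature. The notation is an assumption made here, not taken from the abstract: $N_\theta$ denotes the network with parameter vector $\theta$, and $\{(x_i, y_i)\}_{i=1}^n$ with $y_i \in \{-1, 1\}$ denotes the training set. The paper's exact formulation may differ in details.

$$
\min_{\theta} \; \frac{1}{2} \|\theta\|^2
\quad \text{s.t.} \quad y_i N_\theta(x_i) \ge 1 \quad \text{for all } i \in \{1, \dots, n\}
$$

Under these assumptions, $\theta$ satisfies the KKT conditions if there exist multipliers $\lambda_1, \dots, \lambda_n$ such that

$$
\theta = \sum_{i=1}^{n} \lambda_i \, y_i \, \nabla_\theta N_\theta(x_i), \qquad
\lambda_i \ge 0, \qquad
y_i N_\theta(x_i) \ge 1, \qquad
\lambda_i \bigl( y_i N_\theta(x_i) - 1 \bigr) = 0 .
$$

Because ReLU networks are not differentiable everywhere, $\nabla_\theta N_\theta(x_i)$ is usually read as an element of the Clarke subdifferential. The last condition (complementary slackness) means that only training points lying exactly on the margin can carry a nonzero $\lambda_i$.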
