Towards Assessment of Randomized Mechanisms for Certifying Adversarial Robustness

Abstract

As a certified defense technique, randomized smoothing has received considerable attention due to its scalability to large datasets and neural networks. However, several important questions remain unanswered, such as (i) whether the Gaussian mechanism is an appropriate option for certifying \ell_2-norm robustness, and (ii) whether there is an appropriate randomized mechanism to certify \ell_\infty-norm robustness on high-dimensional datasets. To shed light on these questions, the main difficulty is how to assess each randomized mechanism. In this paper, we propose a generic framework, which connects the existing frameworks in \cite{lecuyer2018certified, li2019certified}, to assess randomized mechanisms. Under our framework, for a mechanism that can certify a certain extent of robustness, we define the magnitude ({\em i.e.,} the expected \ell_\infty norm) of the randomized noise it adds as the metric for assessing its appropriateness. We also derive lower bounds on this metric for the \ell_2-norm and \ell_\infty-norm cases as the criteria for assessment. Based on our framework, we assess the Gaussian and Exponential mechanisms by comparing the magnitude of noise each adds against the corresponding criterion. We first conclude that the Gaussian mechanism is an appropriate option for certifying \ell_2-norm robustness. Surprisingly, we further show that the Gaussian mechanism is also an appropriate option for certifying \ell_p-norm robustness for any p \geq 2 (including the \ell_\infty-norm). Finally, we verify our theoretical results with evaluations on CIFAR10 and ImageNet.
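
The assessment metric above is the expected \ell_\infty norm of the smoothing noise. As a rough, hypothetical illustration (not the paper's code), the sketch below Monte Carlo-estimates E[||z||_\infty] for Gaussian noise z ~ N(0, sigma^2 I_d) at CIFAR10- and ImageNet-sized input dimensions, and compares it to the well-known leading-order growth rate sigma * sqrt(2 ln d); the noise level sigma, the helper name expected_linf_norm, and the sample sizes are illustrative assumptions.

import numpy as np

def expected_linf_norm(sigma: float, d: int, n_samples: int = 2000,
                       batch: int = 100, seed: int = 0) -> float:
    """Monte Carlo estimate of E[||z||_inf] for z ~ N(0, sigma^2 I_d).

    Sampling is batched to keep memory modest for ImageNet-sized d.
    """
    rng = np.random.default_rng(seed)
    maxima = []
    for start in range(0, n_samples, batch):
        size = min(batch, n_samples - start)
        z = rng.normal(0.0, sigma, size=(size, d))
        maxima.append(np.abs(z).max(axis=1))  # per-sample l_inf norm
    return float(np.concatenate(maxima).mean())

if __name__ == "__main__":
    sigma = 0.25  # illustrative noise level
    for d in (3 * 32 * 32, 3 * 224 * 224):  # CIFAR10- and ImageNet-sized inputs
        est = expected_linf_norm(sigma, d)
        # For i.i.d. Gaussian noise, E[||z||_inf] grows roughly like sigma * sqrt(2 ln d).
        approx = sigma * np.sqrt(2 * np.log(d))
        print(f"d={d:7d}  E[||z||_inf] ~ {est:.3f}  (sigma*sqrt(2 ln d) = {approx:.3f})")

This only illustrates how the magnitude of Gaussian noise scales with input dimension; the paper's criteria (the derived lower bounds on the metric) are what determine whether a given mechanism is appropriate.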
