A Theoretical Framework for Robustness of (Deep) Classifiers against Adversarial Samples

1 December 2016
Beilun Wang
Ji Gao
Yanjun Qi
    AAML
Abstract

Adversarial samples are maliciously created inputs that lead a machine learning classifier to produce incorrect output labels. An adversarial sample is often generated by adding adversarial noise (AN) to a normal test sample. Recent literature has tried to analyze and harden learning-based classifiers under such AN. However, most previous studies are empirical and provide little understanding of the underlying reasons why many machine learning classifiers, including deep neural networks (DNNs), are vulnerable to AN. To fill this gap, we propose a theoretical framework using two topological spaces to understand classifiers' robustness against AN. The central idea of our work is that for a certain classification task, the robustness of a classifier $f_1$ against AN is decided by both $f_1$ and its oracle $f_2$ (such as a human annotator of that specific task). This motivates us to formulate a formal definition of "strong-robustness" that describes when a classifier $f_1$ is always robust against AN according to its $f_2$. The second key piece of our framework is the decomposition $f_i = c_i \circ g_i$, in which $i \in \{1,2\}$, $g_i$ includes feature learning operations and $c_i$ includes relatively simple decision functions for the classification. We theoretically prove that $f_1$ is strong-robust against AN $\Leftrightarrow$ a special topological relationship exists between the two feature spaces defined by $g_1$ and $g_2$. Surprisingly, our theorems indicate that the strong-robustness of $f_1$ against AN is fully determined by its $g_1$, not $c_1$. Empirically, we find that the Siamese architecture can intuitively help DNN models approach topological equivalence between the two feature spaces, which in turn effectively improves their robustness against AN.
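To make the abstract's two key ideas concrete, here is a minimal, hypothetical PyTorch sketch (not the authors' code): a classifier written explicitly as the decomposition $f = c \circ g$, plus a Siamese-style contrastive loss that trains $g$ so that same-label inputs land close together in feature space. All names, layer sizes, the margin, and the pairing scheme are illustrative assumptions.

```python
# Illustrative sketch only: decomposed classifier f = c ∘ g with a
# Siamese/contrastive term on the feature space defined by g.
# Shapes, margin, and pairing are assumptions, not the paper's setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecomposedClassifier(nn.Module):
    def __init__(self, in_dim=784, feat_dim=64, n_classes=10):
        super().__init__()
        # g: feature-learning operations
        self.g = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, feat_dim),
        )
        # c: relatively simple decision function on top of g's features
        self.c = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        return self.c(self.g(x))

def contrastive_loss(z1, z2, same_label, margin=1.0):
    # Distance measured in the feature space defined by g.
    d = F.pairwise_distance(z1, z2)
    # Same-label pairs are pulled together; different-label pairs
    # are pushed apart up to the margin.
    pos = same_label * d.pow(2)
    neg = (1.0 - same_label) * F.relu(margin - d).pow(2)
    return (pos + neg).mean()

model = DecomposedClassifier()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative training step on a random paired batch.
x1, x2 = torch.randn(32, 784), torch.randn(32, 784)
y1, y2 = torch.randint(0, 10, (32,)), torch.randint(0, 10, (32,))
same = (y1 == y2).float()

z1, z2 = model.g(x1), model.g(x2)
loss = contrastive_loss(z1, z2, same) + F.cross_entropy(model.c(z1), y1)
opt.zero_grad()
loss.backward()
opt.step()
```

Note how the contrastive term acts only on $g$'s output: this mirrors the paper's claim that strong-robustness is determined by the feature extractor $g_1$ rather than the decision function $c_1$.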
