arXiv:2410.18880

Can we spot a fake?

24 October 2024
S. Mendelson
G. Paouris
Roman Vershynin
Abstract

The problem of detecting fake data inspires the following seemingly simple mathematical question. Sample a data point $X$ from the standard normal distribution in $\mathbb{R}^n$. An adversary observes $X$ and corrupts it by adding a vector $rt$, where they can choose any vector $t$ from a fixed set $T$ of the adversary's "tricks", and where $r>0$ is a fixed radius. The adversary's choice of $t=t(X)$ may depend on the true data $X$. The adversary wants to hide the corruption by making the fake data $X+rt$ statistically indistinguishable from the real data $X$. What is the largest radius $r=r(T)$ for which the adversary can create an undetectable fake? We show that for highly symmetric sets $T$, the detectability radius $r(T)$ is approximately twice the scaled Gaussian width of $T$. The upper bound actually holds for arbitrary sets $T$ and generalizes to arbitrary, non-Gaussian distributions of real data $X$. The lower bound may fail for sets $T$ that are not highly symmetric, but we conjecture that this problem can be solved by considering the focused version of the Gaussian width of $T$, which focuses on the most important directions of $T$.
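
For readers unfamiliar with the central quantity, the sketch below estimates the classical Gaussian width $w(T) = \mathbb{E}\,\sup_{t \in T} \langle g, t \rangle$, with $g$ standard normal, by Monte Carlo for a finite set $T$. It is an illustration only: the abstract does not state the scaling used in the "scaled" Gaussian width, so the code computes the unscaled classical width and makes no claim about $r(T)$ itself. The function names `gaussian_width` and `signed_basis`, and the choice of $T = \{\pm e_i\}$ as an example of a highly symmetric set, are assumptions made here and do not come from the paper.

```python
import math
import random


def gaussian_width(points, num_samples=1000, seed=0):
    """Monte Carlo estimate of w(T) = E sup_{t in T} <g, t> for a finite set T.

    `points` is a list of equal-length lists of floats (the elements of T).
    """
    rng = random.Random(seed)
    dim = len(points[0])
    total = 0.0
    for _ in range(num_samples):
        # Draw one standard normal vector g in R^dim.
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Supremum over the finite set T of the inner product <g, t>.
        total += max(sum(gi * ti for gi, ti in zip(g, t)) for t in points)
    return total / num_samples


def signed_basis(n):
    """Hypothetical example set T = {+e_i, -e_i : i = 1..n} in R^n."""
    points = []
    for i in range(n):
        for sign in (1.0, -1.0):
            v = [0.0] * n
            v[i] = sign
            points.append(v)
    return points


n = 20
width = gaussian_width(signed_basis(n))
# Standard bound for the expected maximum of N = 2n standard Gaussians.
upper_bound = math.sqrt(2 * math.log(2 * n))
print(f"estimated Gaussian width: {width:.3f}")
print(f"sqrt(2 log(2n)) bound:    {upper_bound:.3f}")
```

For this example set the supremum equals $\max_i |g_i|$, so the estimate should fall below the standard bound $\sqrt{2 \log(2n)}$ on the expected maximum of $2n$ standard Gaussians.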
