
On the Dichotomy Between Privacy and Traceability in $\ell_p$ Stochastic Convex Optimization

Main: 11 pages
Bibliography: 5 pages
Tables: 1
Appendix: 36 pages
Abstract

In this paper, we investigate the necessity of memorization in stochastic convex optimization (SCO) under $\ell_p$ geometries. Informally, we say a learning algorithm memorizes $m$ samples (or is $m$-traceable) if, by analyzing its output, it is possible to identify at least $m$ of its training samples. Our main results uncover a fundamental tradeoff between traceability and excess risk in SCO. For every $p \in [1,\infty)$, we establish the existence of a risk threshold below which any sample-efficient learner must memorize a \emph{constant fraction} of its samples. For $p \in [1,2]$, this threshold coincides with the best risk achievable by differentially private (DP) algorithms, i.e., above this threshold, there are algorithms that do not memorize even a single sample. This establishes a sharp dichotomy between privacy and traceability for $p \in [1,2]$. For $p \in (2,\infty)$, this threshold instead gives novel lower bounds for DP learning, partially closing an open problem in this setup. En route to proving these results, we introduce a complexity notion we term the \emph{trace value} of a problem, which unifies privacy lower bounds and traceability results, and we prove a sparse variant of the fingerprinting lemma.
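For context, the following LaTeX sketch records the standard $\ell_p$ stochastic convex optimization excess-risk notion that the thresholds above refer to; the notation ($\mathcal{D}$, $f$, $\mathcal{B}_p^d$, $n$, $\hat{w}$) is generic background, not the paper's exact formalization.

% Standard \ell_p-SCO setup -- a sketch, not the paper's precise definitions.
% Samples z_1, ..., z_n are drawn i.i.d. from an unknown distribution D, and
% f(w; z) is convex and Lipschitz in w over the unit \ell_p ball in R^d.
\[
  F(w) \;=\; \mathbb{E}_{z \sim \mathcal{D}}\bigl[f(w; z)\bigr],
  \qquad
  \mathcal{B}_p^d \;=\; \{\, w \in \mathbb{R}^d : \|w\|_p \le 1 \,\}.
\]
% A learner that outputs \hat{w} from the n samples has excess risk
\[
  \mathrm{err}(\hat{w}) \;=\; F(\hat{w}) \;-\; \min_{w \in \mathcal{B}_p^d} F(w),
\]
% and the risk thresholds discussed in the abstract describe, as a function of
% n, d, and p, how small this excess risk can be made before any
% sample-efficient learner must memorize (be traceable on) a constant fraction
% of its samples.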

@article{voitovych2025_2502.17384,
  title={On Traceability in $\ell_p$ Stochastic Convex Optimization},
  author={Sasha Voitovych and Mahdi Haghifam and Idan Attias and Gintare Karolina Dziugaite and Roi Livni and Daniel M. Roy},
  journal={arXiv preprint arXiv:2502.17384},
  year={2025}
}