2012.06421
When is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?
Symposium on the Theory of Computing (STOC), 2020
11 December 2020
Gavin Brown
Mark Bun
Vitaly Feldman
Adam D. Smith
Kunal Talwar
Papers citing
"When is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?"
50 / 92 papers shown
Extracting alignment data in open models
Federico Barbero
Xiangming Gu
Christopher A. Choquette-Choo
Chawin Sitawarin
Matthew Jagielski
Itay Yona
Petar Velickovic
Ilia Shumailov
Jamie Hayes
314
4
0
21 Oct 2025
AI Agents as Universal Task Solvers
Alessandro Achille
Stefano Soatto
LRM
189
3
1
14 Oct 2025
A Law of Data Reconstruction for Random Features (and Beyond)
Leonardo Iurada
Simone Bombari
Tatiana Tommasi
Marco Mondelli
195
0
0
26 Sep 2025
Efficiently Attacking Memorization Scores
Tue Do
Varun Chandrasekaran
Daniel Alabi
TDI
AAML
322
0
0
24 Sep 2025
Synth-MIA: A Testbed for Auditing Privacy Leakage in Tabular Data Synthesis
Joshua Ward
Xiaofeng Lin
Chi-Hua Wang
Guang Cheng
189
6
0
22 Sep 2025
Access Paths for Efficient Ordering with Large Language Models
Fuheng Zhao
Jiayue Chen
Yiming Pan
Tahseen Rabbani
D. Agrawal
A. El Abbadi
Paritosh Aggarwal
Anupam Datta
Dimitris Tsirogiannis
239
1
0
30 Aug 2025
Unveiling Over-Memorization in Finetuning LLMs for Reasoning Tasks
Zhiwen Ruan
Yun-Nung Chen
Yutao Hou
Peng Li
Yang Liu
Guanhua Chen
290
3
0
06 Aug 2025
A Common Pool of Privacy Problems: Legal and Technical Lessons from a Large-Scale Web-Scraped Machine Learning Dataset
Rachel Hong
Jevan Hutson
William Agnew
Imaad Huda
Tadayoshi Kohno
Jamie Morgenstern
AILaw
SILM
PILM
452
6
0
20 Jun 2025
Black-Box Privacy Attacks on Shared Representations in Multitask Learning
John Abascal
Nicolás Berrios
Alina Oprea
Jonathan R. Ullman
Adam D. Smith
Matthew Jagielski
MLAU
290
0
0
19 Jun 2025
Memorization in Language Models through the Lens of Intrinsic Dimension
Stefan Arnold
PILM
390
4
0
11 Jun 2025
Trade-offs in Data Memorization via Strong Data Processing Inequalities
Annual Conference on Computational Learning Theory (COLT), 2025
Vitaly Feldman
Guy Kornowski
Xin Lyu
TDI
FedML
497
5
0
02 Jun 2025
How much do language models memorize?
John X. Morris
Chawin Sitawarin
Chuan Guo
Narine Kokhlikyan
G. E. Suh
Alexander M. Rush
Kamalika Chaudhuri
Saeed Mahloujifar
KELM
ELM
469
37
0
30 May 2025
Bayesian Perspective on Memorization and Reconstruction
Haim Kaplan
Yishay Mansour
Kobbi Nissim
Uri Stemmer
AAML
338
0
0
29 May 2025
Querying Kernel Methods Suffices for Reconstructing their Training Data
Daniel Barzilai
Yuval Margalit
Eitan Gronich
Gilad Yehudai
Meirav Galun
Ronen Basri
252
0
0
25 May 2025
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models
Minki Kang
Jongwon Jeong
Jaewoong Cho
ALM
LRM
389
9
0
07 Apr 2025
Trustworthy Machine Learning via Memorization and the Granular Long-Tail: A Survey on Interactions, Tradeoffs, and Beyond
Qiongxiu Li
Xiaoyu Luo
Yiyi Chen
Johannes Bjerva
594
8
0
10 Mar 2025
Machine Learners Should Acknowledge the Legal Implications of Large Language Models as Personal Data
Henrik Nolte
Michèle Finck
Kristof Meding
AILaw
PILM
552
4
0
03 Mar 2025
The Pitfalls of Memorization: When Memorization Hurts Generalization
International Conference on Learning Representations (ICLR), 2024
Reza Bayat
Mohammad Pezeshki
Elvis Dohmatob
David Lopez-Paz
Pascal Vincent
OOD
463
18
0
10 Dec 2024
Improved Localized Machine Unlearning Through the Lens of Memorization
Reihaneh Torkzadehmahani
Reza Nasirigerdeh
Georgios Kaissis
Daniel Rueckert
Gintare Karolina Dziugaite
Eleni Triantafillou
MU
287
7
0
03 Dec 2024
Slowing Down Forgetting in Continual Learning
Pascal Janetzky
Tobias Schlagenhauf
Stefan Feuerriegel
CLL
475
0
0
11 Nov 2024
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
710
27
0
03 Oct 2024
Range Membership Inference Attacks
Jiashu Tao
Reza Shokri
510
10
0
09 Aug 2024
Demystifying Verbatim Memorization in Large Language Models
Jing Huang
Diyi Yang
Christopher Potts
ELM
PILM
MU
382
51
0
25 Jul 2024
A Survey on Machine Unlearning: Techniques and New Emerged Privacy Risks
Journal of Information Security and Applications (JISA), 2024
Hengzhu Liu
Ping Xiong
Tianqing Zhu
Philip S. Yu
272
25
0
10 Jun 2024
Data Reconstruction: When You See It and When You Don't
Edith Cohen
Haim Kaplan
Yishay Mansour
Shay Moran
Kobbi Nissim
Uri Stemmer
Eliad Tsfadia
AAML
350
9
0
24 May 2024
Exploring prompts to elicit memorization in masked language model-based named entity recognition
PLoS ONE, 2024
Yuxi Xia
Anastasiia Sedova
Pedro Henrique Luz de Araujo
Vasiliki Kougia
Lisa Nussbaumer
Benjamin Roth
316
1
0
05 May 2024
Differentially Private Reinforcement Learning with Self-Play
Dan Qiao
Yu Wang
285
0
0
11 Apr 2024
Gradient Descent is Pareto-Optimal in the Oracle Complexity and Memory Tradeoff for Feasibility Problems
IEEE Annual Symposium on Foundations of Computer Science (FOCS), 2024
Moise Blanchard
298
1
0
10 Apr 2024
Unveiling Privacy, Memorization, and Input Curvature Links
Deepak Ravikumar
Efstathia Soufleri
Abolfazl Hashemi
Kaushik Roy
345
15
0
28 Feb 2024
Information Complexity of Stochastic Convex Optimization: Applications to Generalization and Memorization
Idan Attias
Gintare Karolina Dziugaite
Mahdi Haghifam
Roi Livni
Daniel M. Roy
376
11
0
14 Feb 2024
Do LLMs Dream of Ontologies?
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2024
Marco Bombieri
Paolo Fiorini
Simone Paolo Ponzetto
M. Rospocher
CLL
408
7
0
26 Jan 2024
Memorization in Self-Supervised Learning Improves Downstream Generalization
Wenhao Wang
Muhammad Ahmad Kaleem
Adam Dziedzic
Michael Backes
Nicolas Papernot
Franziska Boenisch
SSL
453
19
0
19 Jan 2024
The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline
Haonan Wang
Qianli Shen
Yao Tong
Yang Zhang
Kenji Kawaguchi
336
45
0
07 Jan 2024
SoK: Unintended Interactions among Machine Learning Defenses and Risks
Vasisht Duddu
S. Szyller
Nadarajah Asokan
AAML
422
6
0
07 Dec 2023
Differentially Private Non-Convex Optimization under the KL Condition with Optimal Rates
International Conference on Algorithmic Learning Theory (ALT), 2023
Michael Menart
Enayat Ullah
Raman Arora
Raef Bassily
Cristóbal Guzmán
370
3
0
22 Nov 2023
On Retrieval Augmentation and the Limitations of Language Model Training
Ting-Rui Chiang
Xinyan Velocity Yu
Joshua Robinson
Ollie Liu
Isabelle Lee
Dani Yogatama
RALM
264
2
0
16 Nov 2023
Privacy Threats in Stable Diffusion Models
Thomas Cilloni
Charles Fleming
Charles Walter
262
5
0
15 Nov 2023
SoK: Memorisation in machine learning
Dmitrii Usynin
Moritz Knolle
Georgios Kaissis
365
1
0
06 Nov 2023
Why Train More? Effective and Efficient Membership Inference via Memorization
Jihye Choi
Shruti Tople
Varun Chandrasekaran
Somesh Jha
TDI
FedML
291
3
0
12 Oct 2023
What do larger image classifiers memorise?
Michal Lukasik
Vaishnavh Nagarajan
A. S. Rawat
A. Menon
Sanjiv Kumar
292
6
0
09 Oct 2023
Anonymous Learning via Look-Alike Clustering: A Precise Analysis of Model Generalization
Neural Information Processing Systems (NeurIPS), 2023
Adel Javanmard
Vahab Mirrokni
484
3
0
06 Oct 2023
Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses
Neural Information Processing Systems (NeurIPS), 2023
G. Buzaglo
Niv Haim
Gilad Yehudai
Gal Vardi
Yakir Oz
Yaniv Nikankin
Michal Irani
316
25
0
04 Jul 2023
Deconstructing Classifiers: Towards A Data Reconstruction Attack Against Text Classification Models
Adel M. Elmahdy
A. Salem
SILM
362
8
0
23 Jun 2023
Memory-Query Tradeoffs for Randomized Convex Optimization
IEEE Annual Symposium on Foundations of Computer Science (FOCS), 2023
Xinyu Chen
Binghui Peng
320
8
0
21 Jun 2023
Machine Unlearning: A Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Heng Xu
Tianqing Zhu
Lefeng Zhang
Wanlei Zhou
Philip S. Yu
MU
318
48
0
06 Jun 2023
TMI! Finetuned Models Leak Private Information from their Pretraining Data
Proceedings on Privacy Enhancing Technologies (PoPETs), 2023
John Abascal
Stanley Wu
Alina Oprea
Jonathan R. Ullman
348
23
0
01 Jun 2023
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
Neural Information Processing Systems (NeurIPS), 2023
Minki Kang
Seanie Lee
Jinheon Baek
Kenji Kawaguchi
Sung Ju Hwang
ALM
LRM
339
102
0
28 May 2023
Private Everlasting Prediction
Neural Information Processing Systems (NeurIPS), 2023
M. Naor
Kobbi Nissim
Uri Stemmer
Chao Yan
321
5
0
16 May 2023
AI Model Disgorgement: Methods and Choices
Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2023
Alessandro Achille
Michael Kearns
Carson Klingenberg
Stefano Soatto
MU
265
17
0
07 Apr 2023
Near Optimal Memory-Regret Tradeoff for Online Learning
IEEE Annual Symposium on Foundations of Computer Science (FOCS), 2023
Binghui Peng
A. Rubinstein
CLL
413
14
0
03 Mar 2023