Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2210.17546
Cited By
v1
v2
v3 (latest)
Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy
31 October 2022
Daphne Ippolito
Florian Tramèr
Milad Nasr
Chiyuan Zhang
Matthew Jagielski
Katherine Lee
Christopher A. Choquette-Choo
Nicholas Carlini
PILM
MU
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy"
29 / 29 papers shown
Title
Positional Fragility in LLMs: How Offset Effects Reshape Our Understanding of Memorization Risks
Yixuan Xu
Antoni-Joan Solergibert i Llaquet
Antoine Bosselut
Imanol Schlag
85
0
0
19 May 2025
Empirical Privacy Variance
Yuzheng Hu
Fan Wu
Ruicheng Xian
Yuhang Liu
Lydia Zakynthinou
Pritish Kamath
Chiyuan Zhang
David A. Forsyth
132
0
0
16 Mar 2025
A Closer Look at Machine Unlearning for Large Language Models
Xiaojian Yuan
Tianyu Pang
Chao Du
Kejiang Chen
Weiming Zhang
Min Lin
MU
225
13
0
10 Oct 2024
Ward: Provable RAG Dataset Inference via LLM Watermarks
Nikola Jovanović
Robin Staab
Maximilian Baader
Martin Vechev
447
5
0
04 Oct 2024
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
178
7
0
03 Oct 2024
REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space
Tomer Ashuach
Martin Tutek
Yonatan Belinkov
MU
KELM
126
7
0
13 Jun 2024
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
Jingyu Zhang
Marc Marone
Tianjian Li
Benjamin Van Durme
Daniel Khashabi
154
9
0
05 Apr 2024
Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
Aly M. Kassem
Omar Mahmoud
Niloofar Mireshghallah
Hyunwoo J. Kim
Yulia Tsvetkov
Yejin Choi
Sherif Saad
Santu Rana
112
22
0
05 Mar 2024
Reconstructing Training Data from Trained Neural Networks
Niv Haim
Gal Vardi
Gilad Yehudai
Ohad Shamir
Michal Irani
89
141
0
15 Jun 2022
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Kushal Tirumala
Aram H. Markosyan
Luke Zettlemoyer
Armen Aghajanyan
TDI
110
197
0
22 May 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
531
6,293
0
05 Apr 2022
Quantifying Memorization Across Neural Language Models
Nicholas Carlini
Daphne Ippolito
Matthew Jagielski
Katherine Lee
Florian Tramèr
Chiyuan Zhang
PILM
124
630
0
15 Feb 2022
Defending against Reconstruction Attacks with Rényi Differential Privacy
Pierre Stock
I. Shilov
Ilya Mironov
Alexandre Sablayrolles
AAML
SILM
MIACV
65
40
0
15 Feb 2022
What Does it Mean for a Language Model to Preserve Privacy?
Hannah Brown
Katherine Lee
Fatemehsadat Mireshghallah
Reza Shokri
Florian Tramèr
PILM
100
243
0
11 Feb 2022
Reconstructing Training Data with Informed Adversaries
Borja Balle
Giovanni Cherubin
Jamie Hayes
MIACV
AAML
93
171
0
13 Jan 2022
Counterfactual Memorization in Neural Language Models
Chiyuan Zhang
Daphne Ippolito
Katherine Lee
Matthew Jagielski
Florian Tramèr
Nicholas Carlini
91
137
0
24 Dec 2021
Differentially Private Fine-tuning of Language Models
Da Yu
Saurabh Naik
A. Backurs
Sivakanth Gopi
Huseyin A. Inan
...
Y. Lee
Andre Manoel
Lukas Wutschitz
Sergey Yekhanin
Huishuai Zhang
240
371
0
13 Oct 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
360
636
0
14 Jul 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
236
5,665
0
07 Jul 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
313
285
0
02 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
476
2,121
0
31 Dec 2020
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
Basel Alomair
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
512
1,953
0
14 Dec 2020
Training Production Language Models without Memorizing User Data
Swaroop Indra Ramaswamy
Om Thakkar
Rajiv Mathews
Galen Andrew
H. B. McMahan
Franccoise Beaufays
FedML
72
92
0
21 Sep 2020
DART: Open-Domain Structured Data Record to Text Generation
Linyong Nan
Dragomir R. Radev
Rui Zhang
Amrit Rau
Abhinand Sivaprasad
...
Y. Tan
Xi Lin
Caiming Xiong
R. Socher
Nazneen Rajani
60
202
0
06 Jul 2020
MLSUM: The Multilingual Summarization Corpus
Thomas Scialom
Paul-Alexis Dray
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
77
177
0
30 Apr 2020
ToTTo: A Controlled Table-To-Text Generation Dataset
Ankur P. Parikh
Xuezhi Wang
Sebastian Gehrmann
Manaal Faruqui
Bhuwan Dhingra
Diyi Yang
Dipanjan Das
LMTD
104
368
0
29 Apr 2020
The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks
Yuheng Zhang
R. Jia
Hengzhi Pei
Wenxiao Wang
Yue Liu
Basel Alomair
AAML
113
422
0
17 Nov 2019
Semantic Noise Matters for Neural Natural Language Generation
Ondrej Dusek
David M. Howcroft
Verena Rieser
84
118
0
10 Nov 2019
Deep Learning with Differential Privacy
Martín Abadi
Andy Chu
Ian Goodfellow
H. B. McMahan
Ilya Mironov
Kunal Talwar
Li Zhang
FedML
SyDa
216
6,172
0
01 Jul 2016
1