2012.07805
Cited By
Extracting Training Data from Large Language Models
14 December 2020
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
Katherine Lee
Adam Roberts
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
Papers citing "Extracting Training Data from Large Language Models"
50 / 359 papers shown
Title
Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue Systems
Songbo Hu
Ivan Vulić
Fangyu Liu
Anna Korhonen
32
0
0
07 Nov 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
32
79
0
31 Oct 2022
Two Models are Better than One: Federated Learning Is Not Private For Google GBoard Next Word Prediction
Mohamed Suliman
D. Leith
SILM
FedML
18
7
0
30 Oct 2022
LegoNet: A Fast and Exact Unlearning Architecture
Sihao Yu
Fei Sun
J. Guo
Ruqing Zhang
Xueqi Cheng
MU
37
7
0
28 Oct 2022
Privately Fine-Tuning Large Language Models with Differential Privacy
R. Behnia
Mohammadreza Ebrahimi
Jason L. Pacheco
B. Padmanabhan
24
44
0
26 Oct 2022
Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe
Xiang Yue
Huseyin A. Inan
Xuechen Li
Girish Kumar
Julia McAnallen
Hoda Shajari
Huan Sun
David Levitan
Robert Sim
36
79
0
25 Oct 2022
Exploring Mode Connectivity for Pre-trained Language Models
Yujia Qin
Cheng Qian
Jing Yi
Weize Chen
Yankai Lin
Xu Han
Zhiyuan Liu
Maosong Sun
Jie Zhou
27
20
0
25 Oct 2022
Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks
Vikas Raunak
Arul Menezes
30
13
0
24 Oct 2022
DPIS: An Enhanced Mechanism for Differentially Private SGD with Importance Sampling
Jianxin Wei
Ergute Bao
X. Xiao
Y. Yang
39
20
0
18 Oct 2022
Keep Me Updated! Memory Management in Long-term Conversations
Sanghwan Bae
Donghyun Kwak
Soyoung Kang
Min Young Lee
Sungdong Kim
Yuin Jeong
Hyeri Kim
Sang-Woo Lee
W. Park
Nako Sung
38
46
0
17 Oct 2022
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
Evan Crothers
Nathalie Japkowicz
H. Viktor
DeLMO
29
107
0
13 Oct 2022
Understanding Transformer Memorization Recall Through Idioms
Adi Haviv
Ido Cohen
Jacob Gidron
R. Schuster
Yoav Goldberg
Mor Geva
26
48
0
07 Oct 2022
FAST: Improving Controllability for Text Generation with Feedback Aware Self-Training
Junyi Chai
Reid Pryzant
Victor Ye Dong
Konstantin Golobokov
Chenguang Zhu
Yi Liu
29
5
0
06 Oct 2022
CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning
Samuel Maddock
Alexandre Sablayrolles
Pierre Stock
FedML
12
22
0
06 Oct 2022
Differentially Private Optimization on Large Model at Small Cost
Zhiqi Bu
Yu-Xiang Wang
Sheng Zha
George Karypis
30
52
0
30 Sep 2022
Membership Inference Attacks and Generalization: A Causal Perspective
Teodora Baluta
Shiqi Shen
S. Hitarth
Shruti Tople
Prateek Saxena
OOD
MIACV
40
18
0
18 Sep 2022
M^4I: Multi-modal Models Membership Inference
Pingyi Hu
Zihan Wang
Ruoxi Sun
Hu Wang
Minhui Xue
37
26
0
15 Sep 2022
Are Attribute Inference Attacks Just Imputation?
Bargav Jayaraman
David E. Evans
TDI
MIACV
23
46
0
02 Sep 2022
Annotated Dataset Creation through General Purpose Language Models for non-English Medical NLP
Johann Frei
Frank Kramer
21
1
0
30 Aug 2022
Differential Privacy in Natural Language Processing: The Story So Far
Oleksandra Klymenko
Stephen Meisenbacher
Florian Matthes
26
15
0
17 Aug 2022
A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception
Keenan I. Jones
Enes Altuncu
V. N. Franqueira
Yi-Chia Wang
Shujun Li
DeLMO
34
3
0
11 Aug 2022
Training Large-Vocabulary Neural Language Models by Private Federated Learning for Resource-Constrained Devices
Mingbin Xu
Congzheng Song
Ye Tian
Neha Agrawal
Filip Granqvist
...
Shiyi Han
Yaqiao Deng
Leo Liu
Anmol Walia
Alex Jin
FedML
13
22
0
18 Jul 2022
Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset
Peter Henderson
M. Krass
Lucia Zheng
Neel Guha
Christopher D. Manning
Dan Jurafsky
Daniel E. Ho
AILaw
ELM
129
97
0
01 Jul 2022
Measuring Forgetting of Memorized Training Examples
Matthew Jagielski
Om Thakkar
Florian Tramèr
Daphne Ippolito
Katherine Lee
...
Eric Wallace
Shuang Song
Abhradeep Thakurta
Nicolas Papernot
Chiyuan Zhang
TDI
50
102
0
30 Jun 2022
The Privacy Onion Effect: Memorization is Relative
Nicholas Carlini
Matthew Jagielski
Chiyuan Zhang
Nicolas Papernot
Andreas Terzis
Florian Tramèr
PILM
MIACV
30
99
0
21 Jun 2022
Insights into Pre-training via Simpler Synthetic Tasks
Yuhuai Wu
Felix Li
Percy Liang
AIMat
24
20
0
21 Jun 2022
Reconstructing Training Data from Trained Neural Networks
Niv Haim
Gal Vardi
Gilad Yehudai
Ohad Shamir
Michal Irani
40
132
0
15 Jun 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM
ReLM
LRM
48
2,333
0
15 Jun 2022
Self-Supervised Pretraining for Differentially Private Learning
Arash Asadian
Evan Weidner
Lei Jiang
PICV
25
3
0
14 Jun 2022
Challenges in Applying Explainability Methods to Improve the Fairness of NLP Models
Esma Balkir
S. Kiritchenko
I. Nejadgholi
Kathleen C. Fraser
21
36
0
08 Jun 2022
Chefs' Random Tables: Non-Trigonometric Random Features
Valerii Likhosherstov
K. Choromanski
Kumar Avinava Dubey
Frederick Liu
Tamás Sarlós
Adrian Weller
31
17
0
30 May 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
251
565
0
29 May 2022
A Blessing of Dimensionality in Membership Inference through Regularization
Jasper Tan
Daniel LeJeune
Blake Mason
Hamid Javadi
Richard G. Baraniuk
23
18
0
27 May 2022
TempLM: Distilling Language Models into Template-Based Generators
Tianyi Zhang
Mina Lee
Lisa Li
Ende Shen
Tatsunori B. Hashimoto
VLM
32
5
0
23 May 2022
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Kushal Tirumala
Aram H. Markosyan
Luke Zettlemoyer
Armen Aghajanyan
TDI
21
185
0
22 May 2022
Learning to Reverse DNNs from AI Programs Automatically
Simin Chen
Hamed Khanpour
Cong Liu
Wei Yang
35
15
0
20 May 2022
Recovering Private Text in Federated Learning of Language Models
Samyak Gupta
Yangsibo Huang
Zexuan Zhong
Tianyu Gao
Kai Li
Danqi Chen
FedML
25
74
0
17 May 2022
How to Combine Membership-Inference Attacks on Multiple Updated Models
Matthew Jagielski
Stanley Wu
Alina Oprea
Jonathan R. Ullman
Roxana Geambasu
21
10
0
12 May 2022
Provably Confidential Language Modelling
Xuandong Zhao
Lei Li
Yu-Xiang Wang
MU
14
15
0
04 May 2022
Can deep learning match the efficiency of human visual long-term memory in storing object details?
Emin Orhan
VLM
OCL
20
0
0
27 Apr 2022
Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets
Florian Tramèr
Reza Shokri
Ayrton San Joaquin
Hoang Minh Le
Matthew Jagielski
Sanghyun Hong
Nicholas Carlini
MIACV
27
106
0
31 Mar 2022
Generating High Fidelity Data from Low-density Regions using Diffusion Models
Vikash Sehwag
C. Hazirbas
Albert Gordo
Firat Ozgenel
Cristian Canton Ferrer
DiffM
33
66
0
31 Mar 2022
Do Language Models Plagiarize?
Jooyoung Lee
Thai Le
Jinghui Chen
Dongwon Lee
25
73
0
15 Mar 2022
Differentially Private Learning Needs Hidden State (Or Much Faster Convergence)
Jiayuan Ye
Reza Shokri
FedML
22
44
0
10 Mar 2022
Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks
Fatemehsadat Mireshghallah
Kartik Goyal
Archit Uniyal
Taylor Berg-Kirkpatrick
Reza Shokri
MIALM
30
151
0
08 Mar 2022
Towards a Responsible AI Development Lifecycle: Lessons From Information Security
Erick Galinkin
SILM
11
6
0
06 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
311
11,915
0
04 Mar 2022
Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning
Hao He
Kaiwen Zha
Dina Katabi
AAML
28
32
0
22 Feb 2022
When BERT Meets Quantum Temporal Convolution Learning for Text Classification in Heterogeneous Computing
Chao-Han Huck Yang
Jun Qi
Samuel Yen-Chi Chen
Yu Tsao
Pin-Yu Chen
11
50
0
17 Feb 2022
Improved Differential Privacy for SGD via Optimal Private Linear Operators on Adaptive Streams
S. Denisov
H. B. McMahan
J. Rush
Adam D. Smith
Abhradeep Thakurta
FedML
25
59
0
16 Feb 2022