2304.11158
Cited By
Emergent and Predictable Memorization in Large Language Models
21 April 2023
Stella Biderman
USVSN Sai Prashanth
Lintang Sutawika
Hailey Schoelkopf
Quentin G. Anthony
Shivanshu Purohit
Edward Raf
Papers citing
"Emergent and Predictable Memorization in Large Language Models"
50 / 94 papers shown
Title
Automatic Calibration for Membership Inference Attack on Large Language Models
Saleh Zare Zade
Yao Qiang
Xiangyu Zhou
Hui Zhu
Mohammad Amin Roshani
Prashant Khanduri
Dongxiao Zhu
37
1
0
06 May 2025
ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data
Tong Chen
Faeze Brahman
Jiacheng Liu
Niloofar Mireshghallah
Weijia Shi
Pang Wei Koh
Luke Zettlemoyer
Hannaneh Hajishirzi
40
0
0
20 Apr 2025
Beyond Memorization: Mapping the Originality-Quality Frontier of Language Models
Vishakh Padmakumar
Chen Yueh-Han
Jane Pan
Valerie Chen
He He
35
0
0
13 Apr 2025
SUV: Scalable Large Language Model Copyright Compliance with Regularized Selective Unlearning
Tianyang Xu
Xiaoze Liu
Feijie Wu
Xiaoqian Wang
Jing Gao
MU
58
0
0
29 Mar 2025
Gemma 3 Technical Report
Gemma Team
Aishwarya B Kamath
Johan Ferret
Shreya Pathak
Nino Vieillard
...
Harshal Tushar Lehri
Hussein Hazimeh
Ian Ballantyne
Idan Szpektor
Ivan Nardini
VLM
90
41
0
25 Mar 2025
Language Models May Verbatim Complete Text They Were Not Explicitly Trained On
Ken Ziyu Liu
Christopher A. Choquette-Choo
Matthew Jagielski
Peter Kairouz
Sanmi Koyejo
Percy Liang
Nicolas Papernot
55
0
0
21 Mar 2025
Empirical Privacy Variance
Yuzheng Hu
Fan Wu
Ruicheng Xian
Yuhang Liu
Lydia Zakynthinou
Pritish Kamath
Chiyuan Zhang
David A. Forsyth
64
0
0
16 Mar 2025
Harnessing Frequency Spectrum Insights for Image Copyright Protection Against Diffusion Models
Zhenguang Liu
Chao Shuai
Shaojing Fan
Ziping Dong
Jinwu Hu
Zhongjie Ba
Kui Ren
WIGM
45
0
0
14 Mar 2025
Privacy Auditing of Large Language Models
Ashwinee Panda
Xinyu Tang
Milad Nasr
Christopher A. Choquette-Choo
Prateek Mittal
PILM
62
5
0
09 Mar 2025
Machine Learners Should Acknowledge the Legal Implications of Large Language Models as Personal Data
Henrik Nolte
Michèle Finck
Kristof Meding
AILaw
PILM
77
0
0
03 Mar 2025
Large Language Model Distilling Medication Recommendation Model
Qidong Liu
Xian Wu
Xiangyu Zhao
Yuanshao Zhu
Zijian Zhang
Feng Tian
Yefeng Zheng
LM&MA
94
16
0
28 Jan 2025
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
Alexis Huet
Zied Ben-Houidi
Dario Rossi
LLMAG
56
0
0
21 Jan 2025
Think or Remember? Detecting and Directing LLMs Towards Memorization or Generalization
Yi-Fu Fu
Yu-Chieh Tu
Tzu-Ling Cheng
Cheng-Yu Lin
Yi-Ting Yang
Heng-Yi Liu
Keng-Te Liao
Da-Cheng Juan
Shou-de Lin
49
0
0
24 Dec 2024
Understanding and Mitigating Memorization in Diffusion Models for Tabular Data
Zhengyu Fang
Zhimeng Jiang
Huiyuan Chen
Xiao Li
Jing Li
79
2
0
15 Dec 2024
Detecting Memorization in Large Language Models
Eduardo Slonski
78
0
0
02 Dec 2024
Achieving Domain-Independent Certified Robustness via Knowledge Continuity
Alan Sun
Chiyu Ma
Kenneth Ge
Soroush Vosoughi
36
0
0
03 Nov 2024
On Memorization of Large Language Models in Logical Reasoning
Chulin Xie
Yangsibo Huang
Chiyuan Zhang
Da Yu
Xinyun Chen
Bill Yuchen Lin
Bo Li
Badih Ghazi
Ravi Kumar
LRM
53
20
0
30 Oct 2024
Learning and Unlearning of Fabricated Knowledge in Language Models
Chen Sun
Nolan Miller
A. Zhmoginov
Max Vladymyrov
Mark Sandler
KELM
MU
30
1
0
29 Oct 2024
Reasoning, Memorization, and Fine-Tuning Language Models for Non-Cooperative Games
Yunhao Yang
Leonard Berthellemy
Ufuk Topcu
LLMAG
LRM
29
0
0
18 Oct 2024
Decoding Secret Memorization in Code LLMs Through Token-Level Characterization
Yuqing Nie
Chong Wang
Kaidi Wang
Guoai Xu
Guosheng Xu
Haoyu Wang
OffRL
136
1
0
11 Oct 2024
How Much Can We Forget about Data Contamination?
Sebastian Bordt
Suraj Srinivas
Valentyn Boreiko
U. V. Luxburg
45
1
0
04 Oct 2024
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
74
7
0
03 Oct 2024
Quantifying Generalization Complexity for Large Language Models
Zhenting Qi
Hongyin Luo
Xuliang Huang
Zhuokai Zhao
Yibo Jiang
Xiangjun Fan
Himabindu Lakkaraju
James Glass
LRM
ELM
28
5
0
02 Oct 2024
Predicting and analyzing memorization within fine-tuned Large Language Models
Jérémie Dentan
Davide Buscaldi
A. Shabou
Sonia Vanier
35
0
0
27 Sep 2024
Recent Advances in Attack and Defense Approaches of Large Language Models
Jing Cui
Yishi Xu
Zhewei Huang
Shuchang Zhou
Jianbin Jiao
Junge Zhang
PILM
AAML
54
1
0
05 Sep 2024
Reasoning and Tools for Human-Level Forecasting
Elvis Hsieh
Preston Fu
Jonathan Chen
ReLM
LLMAG
LRM
31
1
0
21 Aug 2024
Understanding Memorisation in LLMs: Dynamics, Influencing Factors, and Implications
Till Speicher
Mohammad Aflah Khan
Qinyuan Wu
Vedant Nanda
Soumi Das
Bishwamittra Ghosh
Krishna P. Gummadi
Evimaria Terzi
46
3
0
27 Jul 2024
LLM Circuit Analyses Are Consistent Across Training and Scale
Curt Tigges
Michael Hanna
Qinan Yu
Stella Biderman
39
10
0
15 Jul 2024
A Comprehensive Survey on the Security of Smart Grid: Challenges, Mitigations, and Future Research Opportunities
Arastoo Zibaeirad
Farnoosh Koleini
Shengping Bi
Tao Hou
Tao Wang
AAML
44
14
0
10 Jul 2024
Composable Interventions for Language Models
Arinbjorn Kolbeinsson
Kyle O'Brien
Tianjin Huang
Shanghua Gao
Shiwei Liu
...
Anurag J. Vaidya
Faisal Mahmood
Marinka Zitnik
Tianlong Chen
Thomas Hartvigsen
KELM
MU
89
5
0
09 Jul 2024
Towards More Realistic Extraction Attacks: An Adversarial Perspective
Yash More
Prakhar Ganesh
G. Farnadi
AAML
74
6
0
02 Jul 2024
DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph
Zhehao Zhang
Jiaao Chen
Diyi Yang
LRM
37
8
0
25 Jun 2024
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
USVSN Sai Prashanth
Alvin Deng
Kyle O'Brien
Jyothir S V
Mohammad Aflah Khan
...
Jacob Ray Fuehne
Stella Biderman
Tracy Ke
Katherine Lee
Naomi Saphra
60
12
0
25 Jun 2024
Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Frontier AI Models
Sunny Duan
Mikail Khona
Abhiram Iyer
Rylan Schaeffer
Ila R Fiete
53
3
0
20 Jun 2024
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Hoyeon Chang
Jinho Park
Seonghyeon Ye
Sohee Yang
Youngkyung Seo
Du-Seong Chang
Minjoon Seo
KELM
37
32
0
17 Jun 2024
Measuring memorization in RLHF for code completion
Aneesh Pappu
Billy Porter
Ilia Shumailov
Jamie Hayes
33
0
0
17 Jun 2024
Do Parameters Reveal More than Loss for Membership Inference?
Anshuman Suri
Xiao Zhang
David E. Evans
MIACV
MIALM
AAML
52
1
0
17 Jun 2024
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Zhangchen Xu
Fengqing Jiang
Luyao Niu
Yuntian Deng
Radha Poovendran
Yejin Choi
Bill Yuchen Lin
SyDa
36
120
0
12 Jun 2024
REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy
Haw-Shiuan Chang
Nanyun Peng
Mohit Bansal
Anil Ramakrishna
Tagyoung Chung
HILM
42
2
0
11 Jun 2024
The Mosaic Memory of Large Language Models
Igor Shilov
Matthieu Meeus
Yves-Alexandre de Montjoye
47
3
0
24 May 2024
Large language models can be zero-shot anomaly detectors for time series?
Sarah Alnegheimish
Linh Nguyen
Laure Berti-Equille
K. Veeramachaneni
AI4TS
25
12
0
23 May 2024
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
Xueyan Niu
Bo Bai
Lei Deng
Wei Han
39
6
0
14 May 2024
DEPTH: Discourse Education through Pre-Training Hierarchically
Zachary Bamberger
Ofek Glick
Chaim Baskin
Yonatan Belinkov
67
0
0
13 May 2024
Special Characters Attack: Toward Scalable Training Data Extraction From Large Language Models
Yang Bai
Ge Pei
Jindong Gu
Yong Yang
Xingjun Ma
31
10
0
09 May 2024
In-Context Learning with Long-Context Models: An In-Depth Exploration
Amanda Bertsch
Maor Ivgi
Uri Alon
Jonathan Berant
Matthew R. Gormley
Graham Neubig
ReLM
AIMat
93
64
0
30 Apr 2024
Rho-1: Not All Tokens Are What You Need
Zheng-Wen Lin
Zhibin Gou
Yeyun Gong
Xiao Liu
Yelong Shen
...
Chen Lin
Yujiu Yang
Jian Jiao
Nan Duan
Weizhu Chen
CLL
50
55
0
11 Apr 2024
Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models
Sebastian Bordt
Harsha Nori
Vanessa Rodrigues
Besmira Nushi
Rich Caruana
38
12
0
09 Apr 2024
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
Jingyu Zhang
Marc Marone
Tianjian Li
Benjamin Van Durme
Daniel Khashabi
93
9
0
05 Apr 2024
The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization
Shengyi Huang
Michael Noukhovitch
Arian Hosseini
Kashif Rasul
Weixun Wang
Lewis Tunstall
VLM
30
31
0
24 Mar 2024
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models
Yu Yang
Siddhartha Mishra
Jeffrey N Chiang
Baharan Mirzasoleiman
40
17
0
12 Mar 2024