2305.00118
Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4
28 April 2023
Kent K. Chang
Mackenzie Cramer
Sandeep Soni
David Bamman
RALM
Papers citing "Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4"
26 / 26 papers shown
Title
Tell, Don't Show: Leveraging Language Models' Abstractive Retellings to Model Literary Themes
L. Lucy
Camilla Griffiths
Sarah Levine
Jennifer L. Eberhardt
Dorottya Demszky
David Bamman
73
0
0
29 May 2025
Multimodal Conversation Structure Understanding
Kent K. Chang
Mackenzie Cramer
Anna Ho
Ti Ti Nguyen
Yilin Yuan
David Bamman
74
0
0
23 May 2025
Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models
Hwiyeong Lee
Uiji Hwang
Hyelim Lim
Taeuk Kim
MU
108
1
0
22 May 2025
Positional Fragility in LLMs: How Offset Effects Reshape Our Understanding of Memorization Risks
Yixuan Xu
Antoni-Joan Solergibert i Llaquet
Antoine Bosselut
Imanol Schlag
94
0
0
19 May 2025
The Hitchhikers Guide to Production-ready Trustworthy Foundation Model powered Software (FMware)
Kirill Vasilevski
Benjamin Rombaut
Gopi Krishnan Rajbahadur
G. Oliva
Keheliya Gallaba
...
Haoxiang Zhang
Bouyan Chen
Kishanthan Thangarajah
Ahmed E. Hassan
Zhen Ming
115
0
0
15 May 2025
DP2Unlearning: An Efficient and Guaranteed Unlearning Framework for LLMs
Tamim Al Mahmud
N. Jebreel
Josep Domingo-Ferrer
David Sánchez
MU
75
0
0
18 Apr 2025
Memorization: A Close Look at Books
Iris Ma
Ian Domingo
A. Krone-Martins
Pierre Baldi
Cristina V. Lopes
112
0
0
17 Apr 2025
Benchmarking Large Language Models for Handwritten Text Recognition
Giorgia Crosilla
Lukas Klic
Giovanni Colavizza
114
0
0
19 Mar 2025
Learning on LLM Output Signatures for gray-box Behavior Analysis
Guy Bar-Shalom
Fabrizio Frasca
Derek Lim
Yoav Gelberg
Yftah Ziser
Ran El-Yaniv
Gal Chechik
Haggai Maron
152
0
0
18 Mar 2025
Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
Jaydeep Borkar
Matthew Jagielski
Katherine Lee
Niloofar Mireshghallah
David A. Smith
Christopher A. Choquette-Choo
PILM
241
2
0
21 Feb 2025
Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions
Yujuan Fu
Özlem Uzuner
Meliha Yetisgen
Fei Xia
120
8
0
24 Oct 2024
Reconstruction of Differentially Private Text Sanitization via Large Language Models
Shuchao Pang
Zhigang Lu
Haoran Wang
Peng Fu
Yongbin Zhou
Minhui Xue
AAML
146
5
0
16 Oct 2024
Detecting Training Data of Large Language Models via Expectation Maximization
Gyuwan Kim
Yang Li
Evangelia Spiliopoulou
Jie Ma
Miguel Ballesteros
William Yang Wang
MIALM
280
4
2
10 Oct 2024
Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding
Cheng Wang
Yiwei Wang
Bryan Hooi
Yujun Cai
Nanyun Peng
Kai-Wei Chang
161
6
0
05 Sep 2024
Evaluating Copyright Takedown Methods for Language Models
Boyi Wei
Weijia Shi
Yangsibo Huang
Noah A. Smith
Chiyuan Zhang
Luke Zettlemoyer
Kai Li
Peter Henderson
160
25
0
26 Jun 2024
REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space
Tomer Ashuach
Martin Tutek
Yonatan Belinkov
MU
KELM
199
7
0
13 Jun 2024
Benchmark Data Contamination of Large Language Models: A Survey
Cheng Xu
Shuhao Guan
Derek Greene
Mohand-Tahar Kechadi
ELM
ALM
107
56
0
06 Jun 2024
Recall Them All: Retrieval-Augmented Language Models for Long Object List Extraction from Long Documents
Sneha Singhania
Simon Razniewski
Gerhard Weikum
RALM
131
1
0
04 May 2024
Offset Unlearning for Large Language Models
James Y. Huang
Wenxuan Zhou
Fei Wang
Fred Morstatter
Sheng Zhang
Hoifung Poon
Muhao Chen
MU
110
17
0
17 Apr 2024
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens
Cunxiang Wang
Ruoxi Ning
Boqi Pan
Tonghui Wu
Qipeng Guo
...
Guangsheng Bao
Xiangkun Hu
Zheng Zhang
Qian Wang
Yue Zhang
RALM
241
11
0
18 Mar 2024
Do LLMs Dream of Ontologies?
Marco Bombieri
Paolo Fiorini
Simone Paolo Ponzetto
M. Rospocher
CLL
104
3
0
26 Jan 2024
People Make Better Edits: Measuring the Efficacy of LLM-Generated Counterfactually Augmented Data for Harmful Language Detection
Indira Sen
Dennis Assenmacher
Mattia Samory
Isabelle Augenstein
Wil M.P. van der Aalst
Claudia Wagner
100
21
0
02 Nov 2023
Detecting Pretraining Data from Large Language Models
Weijia Shi
Anirudh Ajith
Mengzhou Xia
Yangsibo Huang
Daogao Liu
Terra Blevins
Danqi Chen
Luke Zettlemoyer
MIALM
126
204
0
25 Oct 2023
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
Sewon Min
Suchin Gururangan
Eric Wallace
Hannaneh Hajishirzi
Noah A. Smith
Luke Zettlemoyer
AILaw
114
68
0
08 Aug 2023
Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence
John J. Nay
David Karamardian
Sarah Lawsky
Wenting Tao
Meghana Moorthy Bhat
Raghav Jain
Aaron Travis Lee
Jonathan H. Choi
Jungo Kasai
ELM
AILaw
118
60
0
12 Jun 2023
Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks
Alon Jacovi
Avi Caciularu
Omer Goldman
Yoav Goldberg
101
107
0
17 May 2023