Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation
Yucheng Li
arXiv:2309.10677 (19 September 2023)
Papers citing "Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation" (9 papers):
Synthesizing Post-Training Data for LLMs through Multi-Agent Simulation
Shuo Tang, Xianghe Pang, Zexi Liu, Bohan Tang, Guangyi Liu, Xiaowen Dong, Yunhong Wang, Yanfeng Wang, Tian Jin (21 Feb 2025)
Are Large Language Models Memorizing Bug Benchmarks?
Daniel Ramos, Claudia Mamede, Kush Jain, Paulo Canelas, Catarina Gamboa, Claire Le Goues (20 Nov 2024)
Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions
Yujuan Fu, Özlem Uzuner, Meliha Yetisgen, Fei Xia (24 Oct 2024)
Fine-tuning can Help Detect Pretraining Data from Large Language Models
Han Zhang, Songxin Zhang, Bingyi Jing, Hongxin Wei (09 Oct 2024)
PaCoST: Paired Confidence Significance Testing for Benchmark Contamination Detection in Large Language Models
Huixuan Zhang, Yun Lin, Xiaojun Wan (26 Jun 2024)
Benchmark Data Contamination of Large Language Models: A Survey
Cheng Xu, Shuhao Guan, Derek Greene, Mohand-Tahar Kechadi (06 Jun 2024)
A Comprehensive Evaluation of Quantization Strategies for Large Language Models
Renren Jin, Jiangcun Du, Wuwei Huang, Wei Liu, Jian Luan, Bin Wang, Deyi Xiong (26 Feb 2024)
Rethinking Benchmark and Contamination for Language Models with Rephrased Samples
Shuo Yang, Wei-Lin Chiang, Lianmin Zheng, Joseph E. Gonzalez, Ion Stoica (08 Nov 2023)
Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, ..., Tom B. Brown, D. Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel (14 Dec 2020)