2410.07137
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
9 October 2024
Xiaosen Zheng
Tianyu Pang
Chao Du
Qian Liu
Jing Jiang
Min-Bin Lin
Papers citing "Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates"
6 / 6 papers shown
Title
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
Tianjian Li
Daniel Khashabi
55
0
0
05 May 2025
Do LLM Evaluators Prefer Themselves for a Reason?
Wei-Lin Chen
Zhepei Wei
Xinyu Zhu
Shi Feng
Yu Meng
ELM
LRM
42
0
0
04 Apr 2025
LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation
Junchen Fu
Xuri Ge
Kaiwen Zheng
Ioannis Arapakis
Xin Xin
J. Jose
87
1
0
20 Feb 2025
Accelerating Unbiased LLM Evaluation via Synthetic Feedback
Zhaoyi Zhou
Yuda Song
Andrea Zanette
ALM
73
0
0
14 Feb 2025
Improving Your Model Ranking on Chatbot Arena by Vote Rigging
Rui Min
Tianyu Pang
Chao Du
Qian Liu
Minhao Cheng
Min-Bin Lin
AAML
57
3
0
29 Jan 2025
Keep Guessing? When Considering Inference Scaling, Mind the Baselines
G. Yona
Or Honovich
Omer Levy
Roee Aharoni
UQLM
LRM
33
0
0
20 Oct 2024