ResearchTrend.AI
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates


9 October 2024
Xiaosen Zheng
Tianyu Pang
Chao Du
Qian Liu
Jing Jiang
Min Lin

Papers citing "Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates"

6 / 6 papers shown
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
Tianjian Li
Daniel Khashabi
55
0
0
05 May 2025
Do LLM Evaluators Prefer Themselves for a Reason?
Wei-Lin Chen
Zhepei Wei
Xinyu Zhu
Shi Feng
Yu Meng
ELM
LRM
42
0
0
04 Apr 2025
LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation
Junchen Fu
Xuri Ge
Kaiwen Zheng
Ioannis Arapakis
Xin Xin
J. Jose
87
1
0
20 Feb 2025
Accelerating Unbiased LLM Evaluation via Synthetic Feedback
Zhaoyi Zhou
Yuda Song
Andrea Zanette
ALM
73
0
0
14 Feb 2025
Improving Your Model Ranking on Chatbot Arena by Vote Rigging
Rui Min
Tianyu Pang
Chao Du
Qian Liu
Minhao Cheng
Min Lin
AAML
57
3
0
29 Jan 2025
Keep Guessing? When Considering Inference Scaling, Mind the Baselines
G. Yona
Or Honovich
Omer Levy
Roee Aharoni
UQLM
LRM
33
0
0
20 Oct 2024