2510.07575
Benchmarking is Broken -- Don't Let AI be its Own Judge
8 October 2025
Zerui Cheng
Stella Wohnig
Ruchika Gupta
Samiul Alam
Tassallah Abdullahi
João Alves Ribeiro
Christian Nielsen-Garcia
Saif Mir
Siran Li
Jason Orender
Seyed Ali Bahrainian
Daniel Kirste
Aaron Gokaslan
Mikołaj Glinka
Carsten Eickhoff
Ruben Wolff
ELM
Papers citing "Benchmarking is Broken -- Don't Let AI be its Own Judge"
No papers found