2505.21972
Judging LLMs on a Simplex
28 May 2025
Patrick Vossler
Fan Xia
Yifan Mai
Jean Feng
Papers citing
"Judging LLMs on a Simplex"
10 / 10 papers shown
Title
Validating LLM-as-a-Judge Systems in the Absence of Gold Labels
Luke M. Guerdan
Solon Barocas
Kenneth Holstein
Hanna M. Wallach
Zhiwei Steven Wu
Alexandra Chouldechova
ALM
ELM
355
1
0
13 Mar 2025
No Free Labels: Limitations of LLM-as-a-Judge Without Human Grounding
Michael Krumdick
Charles Lovering
Varshini Reddy
Seth Ebner
Chris Tanner
ALM
ELM
87
3
0
07 Mar 2025
A Statistical Framework for Ranking LLM-Based Chatbots
Siavash Ameli
Siyuan Zhuang
Ion Stoica
Michael W. Mahoney
ELM
63
2
0
24 Dec 2024
Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations
Evan Miller
ELM
43
22
0
01 Nov 2024
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models
Bofei Gao
Feifan Song
Zhiyong Yang
Zefan Cai
Yibo Miao
...
Lei Sha
Yichang Zhang
Xuancheng Ren
Tianyu Liu
Baobao Chang
ELM
LRM
51
50
0
10 Oct 2024
Systematic Evaluation of LLM-as-a-Judge in LLM Alignment Tasks: Explainable Metrics and Diverse Prompt Templates
Hui Wei
Shenghua He
Tian Xia
Andy H. Wong
Jingyang Lin
Mei Han
ALM
ELM
75
26
0
23 Aug 2024
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
Pat Verga
Sebastian Hofstatter
Sophia Althammer
Yixuan Su
Aleksandra Piktus
Arkady Arkhangorodsky
Minjie Xu
Naomi White
Patrick Lewis
ALM
ELM
57
96
0
29 Apr 2024
Benchmarking and Improving Generator-Validator Consistency of Language Models
Xiang Lisa Li
Vaishnavi Shrivastava
Siyan Li
Tatsunori Hashimoto
Percy Liang
26
29
0
03 Oct 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
167
4,085
0
09 Jun 2023
Prediction-Powered Inference
Anastasios Nikolas Angelopoulos
Stephen Bates
Clara Fannjiang
Michael I. Jordan
Tijana Zrnic
49
95
0
23 Jan 2023