2505.21972
Cited By
Judging LLMs on a Simplex
28 May 2025
Patrick Vossler
Fan Xia
Yifan Mai
Jean Feng
Papers citing "Judging LLMs on a Simplex"
4 / 4 papers shown
Title
A Statistical Framework for Ranking LLM-Based Chatbots
Siavash Ameli
Siyuan Zhuang
Ion Stoica
Michael W. Mahoney
ELM
63
2
0
24 Dec 2024
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models
Bofei Gao
Feifan Song
Zhiyong Yang
Zefan Cai
Yibo Miao
...
Lei Sha
Yichang Zhang
Xuancheng Ren
Tianyu Liu
Baobao Chang
ELM
LRM
47
50
0
10 Oct 2024
Benchmarking and Improving Generator-Validator Consistency of Language Models
Xiang Lisa Li
Vaishnavi Shrivastava
Siyan Li
Tatsunori Hashimoto
Percy Liang
26
29
0
03 Oct 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
136
4,085
0
09 Jun 2023