2411.01533
Cited By
v3 (latest)
Enhancing LLM Evaluations: The Garbling Trick
3 November 2024
William F. Bradley
LRM
ELM
Papers citing "Enhancing LLM Evaluations: The Garbling Trick"
8 / 8 papers shown
Title
LLMs and the Madness of Crowds
William F. Bradley
41
2
0
03 Nov 2024
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
Iman Mirzadeh
Keivan Alizadeh
Hooman Shahrokhi
Oncel Tuzel
Samy Bengio
Mehrdad Farajtabar
AIMat
LRM
146
186
0
07 Oct 2024
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
Yubo Wang
Xueguang Ma
Ge Zhang
Yuansheng Ni
Abhranil Chandra
...
Kai Wang
Alex Zhuang
Rongqi Fan
Xiang Yue
Wenhu Chen
LRM
ELM
169
465
0
03 Jun 2024
Can Perplexity Reflect Large Language Model's Ability in Long Text Understanding?
Yutong Hu
Quzhe Huang
Mingxu Tao
Chen Zhang
Yansong Feng
94
31
0
09 May 2024
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
425
4,609
0
27 Oct 2021
An Ensemble of Simple Convolutional Neural Network Models for MNIST Digit Recognition
Sanghyeon An
Min Jun Lee
Sanglee Park
H. Yang
Jungmin So
87
79
0
12 Aug 2020
HellaSwag: Can a Machine Really Finish Your Sentence?
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
214
2,537
0
19 May 2019
Know What You Don't Know: Unanswerable Questions for SQuAD
Pranav Rajpurkar
Robin Jia
Percy Liang
RALM
ELM
324
2,858
0
11 Jun 2018