2504.01833
YourBench: Easy Custom Evaluation Sets for Everyone
2 April 2025
Shivalika Singh
Clémentine Fourrier
Alina Lozovskaya
Thomas Wolf
Gokhan Tur
Dilek Hakkani-Tur
Papers citing "YourBench: Easy Custom Evaluation Sets for Everyone"
4 / 4 papers shown
Title
Know Or Not: a library for evaluating out-of-knowledge base robustness
Jessica Foo
Pradyumna Shyama Prasad
Shaun Khoo
67
0
0
19 May 2025
Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information
Joshua Harris
Fan Grayson
Felix Feldman
Timothy Laurence
Toby Nonnenmacher
...
Leo Loman
Selina Patel
Thomas Finnie
Samuel Collins
Michael Borowitz
AI4MH
LM&MA
ELM
141
0
0
09 May 2025
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation
Satyapriya Krishna
Kalpesh Krishna
Anhad Mohananey
Steven Schwarcz
Adam Stambler
Shyam Upadhyay
Manaal Faruqui
ReLM
3DV
LRM
RALM
99
30
0
28 Jan 2025
Training on the Test Task Confounds Evaluation and Emergence
Ricardo Dominguez-Olmedo
Florian E. Dorner
Moritz Hardt
ELM
154
9
1
10 Jul 2024