2410.22584
BENCHAGENTS: Automated Benchmark Creation with Agent Interaction
29 October 2024
Natasha Butt, Varun Chandrasekaran, Neel Joshi, Besmira Nushi, Vidhisha Balachandran
Papers citing "BENCHAGENTS: Automated Benchmark Creation with Agent Interaction"
4 / 4 papers shown
Title
Phi-4-reasoning Technical Report
Marah Abdin, Sahaj Agarwal, Ahmed Hassan Awadallah, Vidhisha Balachandran, Harkirat Singh Behl, ..., Vaishnavi Shrivastava, Vibhav Vineet, Yue Wu, Safoora Yousefi, Guoqing Zheng
ReLM
LRM
90
1
0
30 Apr 2025
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models
José P. Pombal, Nuno M. Guerreiro, Ricardo Rei, André F. T. Martins
ALM
75
0
0
01 Apr 2025
Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead
Vidhisha Balachandran, Jingya Chen, Lingjiao Chen, Shivam Garg, Neel Joshi, ..., John Langford, Besmira Nushi, Vibhav Vineet, Yue Wu, Safoora Yousefi
ReLM
LRM
59
3
0
31 Mar 2025
Multi-agent Architecture Search via Agentic Supernet
Guibin Zhang, Luyang Niu, Junfeng Fang, Kaidi Wang, Lei Bai, Xinyu Wang
102
3
0
06 Feb 2025