BENCHAGENTS: Automated Benchmark Creation with Agent Interaction

29 October 2024
Natasha Butt
Varun Chandrasekaran
Neel Joshi
Besmira Nushi
Vidhisha Balachandran

Papers citing "BENCHAGENTS: Automated Benchmark Creation with Agent Interaction"

4 / 4 papers shown
Phi-4-reasoning Technical Report
Marah Abdin
Sahaj Agarwal
Ahmed Hassan Awadallah
Vidhisha Balachandran
Harkirat Singh Behl
...
Vaishnavi Shrivastava
Vibhav Vineet
Yue Wu
Safoora Yousefi
Guoqing Zheng
ReLM
LRM
90
1
0
30 Apr 2025
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models
José P. Pombal
Nuno M. Guerreiro
Ricardo Rei
André F. T. Martins
ALM
75
0
0
01 Apr 2025
Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead
Vidhisha Balachandran
Jingya Chen
Lingjiao Chen
Shivam Garg
Neel Joshi
...
John Langford
Besmira Nushi
Vibhav Vineet
Yue Wu
Safoora Yousefi
ReLM
LRM
59
3
0
31 Mar 2025
Multi-agent Architecture Search via Agentic Supernet
Guibin Zhang
Luyang Niu
Junfeng Fang
Kaidi Wang
Lei Bai
Xinyu Wang
102
3
0
06 Feb 2025