2403.11128
Beyond Static Evaluation: A Dynamic Approach to Assessing AI Assistants' API Invocation Capabilities
17 March 2024
Honglin Mu
Yang Xu
Yunlong Feng
Xiaofeng Han
Yitong Li
Yutai Hou
Wanxiang Che
ELM
Papers citing
"Beyond Static Evaluation: A Dynamic Approach to Assessing AI Assistants' API Invocation Capabilities"
6 / 6 papers shown
Multi-Mission Tool Bench: Assessing the Robustness of LLM based Agents through Related and Dynamic Missions
Peijie Yu
Yifan Yang
Jiajian Li
Zelong Zhang
Haorui Wang
Xiao Feng
Feng Zhang
LLMAG
155
2
0
03 Apr 2025
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
312
4,253
0
09 Jun 2023
Gorilla: Large Language Model Connected with Massive APIs
Shishir G. Patil
Tianjun Zhang
Xin Wang
Joseph E. Gonzalez
ELM
CLL
ALM
SyDa
78
551
0
24 May 2023
MultiWOZ 2.2: A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines
Xiaoxue Zang
Abhinav Rastogi
Srinivas Sunkara
Raghav Gupta
Jianguo Zhang
Jindong Chen
67
277
0
10 Jul 2020
CoQA: A Conversational Question Answering Challenge
Siva Reddy
Danqi Chen
Christopher D. Manning
RALM
HAI
98
1,201
0
21 Aug 2018
QuAC: Question Answering in Context
Eunsol Choi
He He
Mohit Iyyer
Mark Yatskar
Wen-tau Yih
Yejin Choi
Percy Liang
Luke Zettlemoyer
104
826
0
21 Aug 2018