arXiv:2403.11128
Beyond Static Evaluation: A Dynamic Approach to Assessing AI Assistants' API Invocation Capabilities

17 March 2024
Honglin Mu
Yang Xu
Yunlong Feng
Xiaofeng Han
Yitong Li
Yutai Hou
Wanxiang Che
    ELM

Papers citing "Beyond Static Evaluation: A Dynamic Approach to Assessing AI Assistants' API Invocation Capabilities"

6 / 6 papers shown
Title
Multi-Mission Tool Bench: Assessing the Robustness of LLM based Agents through Related and Dynamic Missions
Peijie Yu
Yifan Yang
Jiajian Li
Zelong Zhang
Haorui Wang
Xiao Feng
Feng Zhang
LLMAG
155
2
0
03 Apr 2025
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
312
4,253
0
09 Jun 2023
Gorilla: Large Language Model Connected with Massive APIs
Shishir G. Patil
Tianjun Zhang
Xin Wang
Joseph E. Gonzalez
ELM
CLL
ALM
SyDa
78
551
0
24 May 2023
MultiWOZ 2.2 : A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines
Xiaoxue Zang
Abhinav Rastogi
Srinivas Sunkara
Raghav Gupta
Jianguo Zhang
Jindong Chen
67
277
0
10 Jul 2020
CoQA: A Conversational Question Answering Challenge
Siva Reddy
Danqi Chen
Christopher D. Manning
RALM
HAI
98
1,201
0
21 Aug 2018
QuAC : Question Answering in Context
Eunsol Choi
He He
Mohit Iyyer
Mark Yatskar
Wen-tau Yih
Yejin Choi
Percy Liang
Luke Zettlemoyer
104
826
0
21 Aug 2018