ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents
28 June 2024
Haiyang Shen
Yue Li
Desong Meng
Dongqi Cai
Sheng Qi
Li Zhang
Mengwei Xu
Yudong Han
    LLMAG

Papers citing "ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents"

31 / 31 papers shown
AI Scientists Fail Without Strong Implementation Capability
Minjun Zhu
Qiujie Xie
Yixuan Weng
Jian Wu
Zhen Lin
Linyi Yang
Yue Zhang
ELM
102
0
0
02 Jun 2025
FamilyTool: A Multi-hop Personalized Tool Use Benchmark
Yuxin Wang
Yiran Guo
Y. Zheng
Zhangyue Yin
Tian Jin
Jie Yang
Jiajun Chen
Yuan Li
Xuanjing Huang
Xipeng Qiu
94
0
0
09 Apr 2025
Multi-Mission Tool Bench: Assessing the Robustness of LLM based Agents through Related and Dynamic Missions
Peijie Yu
Yifan Yang
Jiajian Li
Zelong Zhang
Haorui Wang
Xiao Feng
Feng Zhang
LLMAG
225
2
0
03 Apr 2025
Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents
Shuo Ren
Pu Jian
Zhenjiang Ren
Chunlin Leng
Can Xie
Jiajun Zhang
LLMAG, AI4CE
182
4
0
31 Mar 2025
StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs
Zhicheng Guo
Sijie Cheng
Yuchen Niu
Hao Wang
Sicheng Zhou
Wenbing Huang
Yang Liu
CLL, OffRL
221
0
0
26 Mar 2025
PEToolLLM: Towards Personalized Tool Learning in Large Language Models
Qiancheng Xu
Yunshui Li
Heming Xia
Fan Liu
Min Yang
Wenjie Li
165
0
0
26 Feb 2025
Standard Benchmarks Fail - Auditing LLM Agents in Finance Must Prioritize Risk
Zichen Chen
Jiaao Chen
Jianda Chen
Misha Sra
ELM
169
1
0
21 Feb 2025
Beyond Browsing: API-Based Web Agents
Yueqi Song
Frank F. Xu
Shuyan Zhou
Graham Neubig
172
23
0
21 Oct 2024
NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls
Kinjal Basu
Ibrahim Abdelaziz
Kiran Kate
Mayank Agarwal
Maxwell Crouse
...
Sadhana Kumaravel
Saurabh Goyal
Xin Wang
Luis A. Lastras
Pavan Kapanipathi
103
11
0
04 Sep 2024
Tool Learning with Large Language Models: A Survey
Changle Qu
Sunhao Dai
Xiaochi Wei
Hengyi Cai
Shuaiqiang Wang
D. Yin
Jun Xu
Jirong Wen
LLMAG
105
107
0
28 May 2024
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents
Cheng Qian
Bingxiang He
Zhuang Zhong
Jia Deng
Yujia Qin
...
Zhong Zhang
Jie Zhou
Yankai Lin
Zhiyuan Liu
Maosong Sun
74
36
0
14 Feb 2024
Retrieval-Augmented Generation for Large Language Models: A Survey
Yunfan Gao
Yun Xiong
Xinyu Gao
Kangxiang Jia
Jinliu Pan
Yuxi Bi
Yi Dai
Jiawei Sun
Meng Wang
Haofen Wang
3DV, RALM
350
1,846
1
18 Dec 2023
KwaiAgents: Generalized Information-seeking Agent System with Large Language Models
Haojie Pan
Zepeng Zhai
Hao Yuan
Yaojia Lv
Ruiji Fu
Ming Liu
Zhongyuan Wang
Bing Qin
LLMAG, RALM
83
12
0
08 Dec 2023
MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use
Yue Huang
Jiawen Shi
Yuan Li
Chenrui Fan
Siyuan Wu
...
Yixin Liu
Pan Zhou
Yao Wan
Neil Zhenqiang Gong
Lichao Sun
LLMAG
129
96
0
04 Oct 2023
The Rise and Potential of Large Language Model Based Agents: A Survey
Zhiheng Xi
Wenxiang Chen
Xin Guo
Wei He
Yiwen Ding
...
Wenjuan Qin
Yongyan Zheng
Xipeng Qiu
Xuanjing Huang
Tao Gui
LM&MA, LM&Ro, 3DV, AI4CE
200
959
0
14 Sep 2023
A Survey on Large Language Model based Autonomous Agents
Lei Wang
Chengbang Ma
Xueyang Feng
Zeyu Zhang
Hao-ran Yang
...
Xu Chen
Yankai Lin
Wayne Xin Zhao
Zhewei Wei
Ji-Rong Wen
LLMAG, AI4CE, LM&Ro
227
1,333
0
22 Aug 2023
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
Yujia Qin
Shi Liang
Yining Ye
Kunlun Zhu
Lan Yan
...
Jie Zhou
Mark B. Gerstein
Dahai Li
Zhiyuan Liu
Maosong Sun
CLL, ALM, LLMAG, ELM, LM&MA
236
712
0
31 Jul 2023
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Mian
OffRL
270
630
0
12 Jul 2023
ToolQA: A Dataset for LLM Question Answering with External Tools
Yuchen Zhuang
Yue Yu
Kuan-Chieh Wang
Haotian Sun
Chao Zhang
ELM, LLMAG
103
252
0
23 Jun 2023
ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases
Qiaoyu Tang
Ziliang Deng
Hongyu Lin
Xianpei Han
Qiao Liang
Boxi Cao
Le Sun
CLL, SyDa
139
202
0
08 Jun 2023
On the Tool Manipulation Capability of Open-source Large Language Models
Qiantong Xu
Fenglu Hong
Yangqiu Song
Changran Hu
Zheng Chen
Jian Zhang
LLMAG
110
78
0
25 May 2023
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory
Xizhou Zhu
Yuntao Chen
Hao Tian
Chenxin Tao
Weijie Su
...
Lewei Lu
Xiaogang Wang
Yu Qiao
Zhaoxiang Zhang
Jifeng Dai
LLMAG, LM&Ro
121
240
0
25 May 2023
Voyager: An Open-Ended Embodied Agent with Large Language Models
Guanzhi Wang
Yuqi Xie
Yunfan Jiang
Ajay Mandlekar
Chaowei Xiao
Yuke Zhu
Linxi Fan
Anima Anandkumar
LM&Ro, SyDa
191
844
0
25 May 2023
Gorilla: Large Language Model Connected with Massive APIs
Shishir G. Patil
Tianjun Zhang
Xin Wang
Joseph E. Gonzalez
ELM, CLL, ALM, SyDa
196
572
0
24 May 2023
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
Shibo Hao
Tianyang Liu
Zhen Wang
Zhiting Hu
RALM, LLMAG
163
183
0
19 May 2023
Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs
Jinyang Li
Binyuan Hui
Ge Qu
Jiaxi Yang
Binhua Li
...
Guoliang Li
Kevin C. C. Chang
Fei Huang
Reynold Cheng
Yongbin Li
LMTD
186
422
0
04 May 2023
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
Yongliang Shen
Kaitao Song
Xu Tan
Dongsheng Li
Weiming Lu
Yueting Zhuang
MLLM
178
913
0
30 Mar 2023
Toolformer: Language Models Can Teach Themselves to Use Tools
Timo Schick
Jane Dwivedi-Yu
Roberto Dessì
Roberta Raileanu
Maria Lomeli
Luke Zettlemoyer
Nicola Cancedda
Thomas Scialom
SyDa, RALM
231
1,781
0
09 Feb 2023
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG, ReLM, LRM
525
3,007
0
06 Oct 2022
From Robotic Process Automation to Intelligent Process Automation: Emerging Trends
Tathagata Chakraborti
Vatche Isahagian
Rania Y. Khalaf
Y. Khazaeni
Vinod Muthusamy
Sadhana Kumaravel
Merve Unuvar
AI4CE
56
46
0
27 Jul 2020
IFTTT vs. Zapier: A Comparative Study of Trigger-Action Programming Frameworks
Amir Rahmati
Earlence Fernandes
Jaeyeon Jung
A. Prakash
43
35
0
08 Sep 2017