2310.03128
MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use
4 October 2023
Yue Huang
Jiawen Shi
Yuan Li
Chenrui Fan
Siyuan Wu
Qihui Zhang
Yixin Liu
Pan Zhou
Yao Wan
Neil Zhenqiang Gong
Lichao Sun
LLMAG
Papers citing
"MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use"
29 / 29 papers shown
ToolSpectrum: Towards Personalized Tool Utilization for Large Language Models
Zihao Cheng
Hongru Wang
Zeming Liu
Yuhang Guo
Yuanfang Guo
Yunhong Wang
Haifeng Wang
82
0
0
19 May 2025
OMAC: A Broad Optimization Framework for LLM-Based Multi-Agent Collaboration
Shijun Li
Hilaf Hasson
Joydeep Ghosh
LLMAG
78
0
0
17 May 2025
Multi-Mission Tool Bench: Assessing the Robustness of LLM based Agents through Related and Dynamic Missions
Peijie Yu
Yifan Yang
Jiajian Li
Zelong Zhang
Haorui Wang
Xiao Feng
Feng Zhang
LLMAG
166
2
0
03 Apr 2025
SMART: Self-Aware Agent for Tool Overuse Mitigation
Cheng Qian
Emre Can Acikgoz
H. Wang
Xiusi Chen
Avirup Sil
Dilek Hakkani-Tur
Gokhan Tur
Heng Ji
LLMAG
KELM
LRM
116
8
0
17 Feb 2025
Artificial Intelligence in Spectroscopy: Advancing Chemistry from Prediction to Generation and Beyond
Kehan Guo
Yili Shen
Gisela Abigail Gonzalez-Montiel
Yue Huang
Yujun Zhou
...
Zhichun Guo
Prayel Das
Nitesh Chawla
Olaf Wiest
Wei Wei
157
2
0
14 Feb 2025
Preference Leakage: A Contamination Problem in LLM-as-a-judge
Dawei Li
Renliang Sun
Yue Huang
Ming Zhong
Bohan Jiang
Jiawei Han
Wei Wei
Wei Wang
Huan Liu
124
29
0
03 Feb 2025
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use
Junjie Ye
Zhengyin Du
Xuesong Yao
Weijian Lin
Yufei Xu
...
Siyu Yuan
Tao Gui
Qi Zhang
Xuanjing Huang
Jiecao Chen
92
0
0
05 Jan 2025
Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents
Weiwei Sun
Lingyong Yan
Xinyu Ma
Shuaiqiang Wang
Pengjie Ren
Zhumin Chen
Dawei Yin
Zhaochun Ren
RALM
ALM
ELM
LRM
LM&MA
179
308
0
31 Dec 2024
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
Bohan Lyu
Yadi Cao
Duncan Watson-Parris
Leon Bergen
Taylor Berg-Kirkpatrick
Rose Yu
113
4
0
01 Nov 2024
MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling
Yakun Zhu
Shaohang Wei
Xu Wang
Kui Xue
Xiaofan Zhang
Shanghang Zhang
96
2
0
17 Oct 2024
Learning Evolving Tools for Large Language Models
Guoxin Chen
Zhong Zhang
Xin Cong
Fangda Guo
Yesai Wu
Yankai Lin
Wenzheng Feng
Yasheng Wang
KELM
78
2
0
09 Oct 2024
ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents
Haiyang Shen
Yue Li
Desong Meng
Dongqi Cai
Sheng Qi
Li Zhang
Mengwei Xu
Yudong Han
LLMAG
89
12
0
28 Jun 2024
StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models
Zhicheng Guo
Sijie Cheng
Hao Wang
Shihao Liang
Yujia Qin
Peng Li
Zhiyuan Liu
Maosong Sun
Yang Liu
ELM
115
28
0
12 Mar 2024
Defending Jailbreak Prompts via In-Context Adversarial Game
Yujun Zhou
Yufei Han
Haomin Zhuang
Kehan Guo
Zhenwen Liang
Hongyan Bao
Xiangliang Zhang
LLMAG
AAML
70
14
0
20 Feb 2024
MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning
Chenyu Wang
Weixin Luo
Qianyu Chen
Haonan Mai
Jindi Guo
Sixun Dong
Xiaohua Xuan
MLLM
LLMAG
102
18
0
19 Jan 2024
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Zhengqing Yuan
Zhaoxu Li
Weiran Huang
Yanfang Ye
Lichao Sun
47
51
0
28 Dec 2023
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Guosheng Dong
Zhiying Wu
ELM
LRM
165
743
0
19 Sep 2023
Simple synthetic data reduces sycophancy in large language models
Jerry W. Wei
Da Huang
Yifeng Lu
Denny Zhou
Quoc V. Le
79
73
0
07 Aug 2023
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models
Cheng-Yu Hsieh
Sibei Chen
Chun-Liang Li
Yasuhisa Fujii
Alexander Ratner
Chen-Yu Lee
Ranjay Krishna
Tomas Pfister
LLMAG
SyDa
98
43
0
01 Aug 2023
ClinicalGPT: Large Language Models Finetuned with Diverse Medical Data and Comprehensive Evaluation
Guangyu Wang
Guoxing Yang
Zongxin Du
Longjun Fan
Xiaohu Li
LM&MA
ELM
AI4MH
66
84
0
16 Jun 2023
On the Tool Manipulation Capability of Open-source Large Language Models
Qiantong Xu
Fenglu Hong
Yangqiu Song
Changran Hu
Zheng Chen
Jian Zhang
LLMAG
87
75
0
25 May 2023
Gorilla: Large Language Model Connected with Massive APIs
Shishir G. Patil
Tianjun Zhang
Xin Wang
Joseph E. Gonzalez
ELM
CLL
ALM
SyDa
84
552
0
24 May 2023
Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs
Jinyang Li
Binyuan Hui
Ge Qu
Jiaxi Yang
Binhua Li
...
Guoliang Li
Kevin C. C. Chang
Fei Huang
Reynold Cheng
Yongbin Li
LMTD
96
407
0
04 May 2023
Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models
Emilio Ferrara
SILM
91
254
0
07 Apr 2023
TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs
Yaobo Liang
Chenfei Wu
Ting Song
Wenshan Wu
Yan Xia
...
Shaoguang Mao
Yuntao Wang
Linjun Shou
Ming Gong
Nan Duan
LLMAG
CLL
65
199
0
29 Mar 2023
A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
Yihan Cao
Siyu Li
Yixin Liu
Zhiling Yan
Yutong Dai
Philip S. Yu
Lichao Sun
73
535
0
07 Mar 2023
Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning
Antonia Creswell
Murray Shanahan
I. Higgins
ReLM
LRM
94
361
0
19 May 2022
Internet-augmented language models through few-shot prompting for open-domain question answering
Angeliki Lazaridou
E. Gribovskaya
Wojciech Stokowiec
N. Grigorev
KELM
LRM
47
135
0
10 Mar 2022
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
Alon Talmor
Jonathan Herzig
Nicholas Lourie
Jonathan Berant
RALM
140
1,727
0
02 Nov 2018