2305.15334
Gorilla: Large Language Model Connected with Massive APIs
24 May 2023
Shishir G. Patil
Tianjun Zhang
Xin Wang
Joseph E. Gonzalez
ELM
CLL
ALM
SyDa
Papers citing "Gorilla: Large Language Model Connected with Massive APIs"
50 / 392 papers shown
Title
Benchmarking Failures in Tool-Augmented Language Models
Eduardo Treviño
Hugo Contant
James Ngai
Graham Neubig
Zora Z. Wang
72
0
0
18 Mar 2025
Why Do Multi-Agent LLM Systems Fail?
Mert Cemri
Melissa Z. Pan
Shuyi Yang
Lakshya A Agrawal
Bhavya Chopra
...
Dan Klein
Kannan Ramchandran
Matei A. Zaharia
Joseph E. Gonzalez
Ion Stoica
LLMAG
Presented at ResearchTrend Connect | LLMAG on 23 Apr 2025
131
11
0
17 Mar 2025
Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation
Bowen Baker
Joost Huizinga
Leo Gao
Zehao Dou
M. Guan
Aleksander Mądry
Wojciech Zaremba
J. Pachocki
David Farhi
LRM
77
13
0
14 Mar 2025
Attacking Multimodal OS Agents with Malicious Image Patches
Lukas Aichberger
Alasdair Paren
Y. Gal
Philip Torr
Adel Bibi
AAML
62
4
0
13 Mar 2025
AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents
Arman Zharmagambetov
Chuan Guo
Ivan Evtimov
Maya Pavlova
Ruslan Salakhutdinov
Kamalika Chaudhuri
75
1
0
12 Mar 2025
Magnet: Multi-turn Tool-use Data Synthesis and Distillation via Graph Translation
Fan Yin
Zifeng Wang
I-Hung Hsu
Jun Yan
Ke Jiang
...
L. Le
Kai-Wei Chang
Chen-Yu Lee
Hamid Palangi
Tomas Pfister
60
4
0
10 Mar 2025
Queueing, Predictions, and LLMs: Challenges and Open Problems
Michael Mitzenmacher
Rana Shahout
AI4TS
LRM
44
1
0
10 Mar 2025
Alignment for Efficient Tool Calling of Large Language Models
Hongshen Xu
Zihan Wang
Zichen Zhu
Lei Pan
Xingyu Chen
Lu Chen
Kai Yu
49
0
0
09 Mar 2025
Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol
Roham Koohestani
Philippe de Bekker
M. Izadi
VLM
50
0
0
07 Mar 2025
MPO: Boosting LLM Agents with Meta Plan Optimization
Weimin Xiong
Yifan Song
Qingxiu Dong
Bingchan Zhao
Feifan Song
Xun Wang
Sujian Li
LLMAG
81
1
0
04 Mar 2025
ATLaS: Agent Tuning via Learning Critical Steps
Zhixun Chen
Ming Li
Yuanmin Huang
Yali Du
Meng Fang
Dinesh Manocha
83
3
0
04 Mar 2025
AskToAct: Enhancing LLMs Tool Use via Self-Correcting Clarification
Xuan Zhang
Yongliang Shen
Zhe Zheng
Linjuan Wu
Wenqi Zhang
Yuchen Yan
Qiuying Peng
Jun Wang
Weiming Lu
KELM
85
1
0
03 Mar 2025
ToolDial: Multi-turn Dialogue Generation Method for Tool-Augmented Language Models
Jeonghoon Shim
Gyuhyeon Seo
Cheongsu Lim
Yohan Jo
49
4
0
01 Mar 2025
PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation
Albert Gong
Kamilė Stankevičiūtė
Chao-gang Wan
Anmol Kabra
Raphael Thesmar
Johann Lee
Julius Klenke
Carla P. Gomes
Kilian Q. Weinberger
RALM
LRM
62
0
0
27 Feb 2025
GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation
Jie He
Jennifer Neville
Mengting Wan
Longqi Yang
Hui Liu
Xiaofeng Xu
Xia Song
Jeff Z. Pan
Pei Zhou
LLMAG
SyDa
63
0
0
26 Feb 2025
Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model
Emre Can Acikgoz
Jeremiah Greer
Akul Datta
Ze Yang
William Zeng
Oussama Elachqar
Emmanouil Koukoumidis
Dilek Hakkani-Tur
Gokhan Tur
LLMAG
108
3
0
20 Feb 2025
Autellix: An Efficient Serving Engine for LLM Agents as General Programs
Michael Luo
Xiaoxiang Shi
Colin Cai
Tianjun Zhang
Justin Wong
...
Chi Wang
Yanping Huang
Zhifeng Chen
Joseph E. Gonzalez
Ion Stoica
55
3
0
20 Feb 2025
A Survey on LLM-powered Agents for Recommender Systems
Qiyao Peng
Hongtao Liu
Hua Huang
Qing Yang
Minglai Shao
LLMAG
LRM
86
2
0
14 Feb 2025
Self-Training Large Language Models for Tool-Use Without Demonstrations
Ne Luo
Aryo Pradipta Gema
Xuanli He
Emile van Krieken
Pietro Lesci
Pasquale Minervini
LLMAG
82
1
0
09 Feb 2025
SyntheT2C: Generating Synthetic Data for Fine-Tuning Large Language Models on the Text2Cypher Task
Zijie Zhong
Linqing Zhong
Zhaoze Sun
Qingyun Jin
Zengchang Qin
Xiaofan Zhang
63
7
0
28 Jan 2025
How Should We Build A Benchmark? Revisiting 274 Code-Related Benchmarks For LLMs
Jialun Cao
Yuk-Kit Chan
Zixuan Ling
Wenxuan Wang
Shuqing Li
...
Pinjia He
Shuai Wang
Zibin Zheng
Michael R. Lyu
Shing-Chi Cheung
ALM
71
1
0
18 Jan 2025
Large Language Models, Knowledge Graphs and Search Engines: A Crossroads for Answering Users' Questions
Aidan Hogan
Xin Luna Dong
Denny Vrandečić
Gerhard Weikum
57
2
0
12 Jan 2025
Multi-Agent Collaboration Mechanisms: A Survey of LLMs
Khanh-Tung Tran
Dung Dao
Minh-Duong Nguyen
Quoc-Viet Pham
Barry O'Sullivan
Hoang D. Nguyen
LLMAG
103
29
0
10 Jan 2025
NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models
Han Han
Tong Zhu
Xiang Zhang
Mengsong Wu
Hao Xiong
Wenliang Chen
38
0
0
08 Jan 2025
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use
Junjie Ye
Zhengyin Du
Xuesong Yao
Weijian Lin
Yufei Xu
...
Siyu Yuan
Tao Gui
Qi Zhang
Xuanjing Huang
Jiecao Chen
59
0
0
05 Jan 2025
Optimizing Small Language Models for In-Vehicle Function-Calling
Yahya Sowti Khiabani
Farris Atif
Chieh Hsu
Sven Stahlmann
Tobias Michels
Sebastian Kramer
Benedikt Heidrich
M. Saquib Sarfraz
Julian Merten
Faezeh Tafazzoli
34
1
0
04 Jan 2025
Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web
Hiroki Furuta
Yutaka Matsuo
Aleksandra Faust
Izzeddin Gur
CLL
95
14
0
03 Jan 2025
Plancraft: an evaluation dataset for planning with LLM agents
Gautier Dagan
Frank Keller
A. Lascarides
LLMAG
39
0
0
31 Dec 2024
The Task Shield: Enforcing Task Alignment to Defend Against Indirect Prompt Injection in LLM Agents
Feiran Jia
Tong Wu
Xin Qin
Anna Squicciarini
LLMAG
AAML
98
4
0
21 Dec 2024
Disentangling Reasoning Tokens and Boilerplate Tokens For Language Model Fine-tuning
Ziang Ye
Zizhuo Zhang
Yang Zhang
Jianxin Ma
Junyang Lin
Fuli Feng
LRM
85
0
0
19 Dec 2024
CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers
Dimitrios Mallis
Ahmet Serdar Karadeniz
Sebastian Cavada
Danila Rukhovich
Niki Maria Foteinopoulou
K. Cherenkova
Anis Kacem
Djamila Aouada
79
3
0
18 Dec 2024
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Frank F. Xu
Yufan Song
Boxuan Li
Yuxuan Tang
Kritanjali Jain
...
Wayne Chi
Lawrence Jang
Yiqing Xie
Shuyan Zhou
Graham Neubig
LLMAG
134
22
0
18 Dec 2024
Empowering LLMs to Understand and Generate Complex Vector Graphics
Ximing Xing
Juncheng Hu
Guotao Liang
Jing Zhang
Dong Xu
Qian Yu
97
7
0
15 Dec 2024
GraphTool-Instruction: Revolutionizing Graph Reasoning in LLMs through Decomposed Subtask Instruction
Rongzheng Wang
Shuang Liang
Qizhi Chen
Jiasheng Zhang
Ke Qin
92
0
0
11 Dec 2024
LABIIUM: AI-Enhanced Zero-configuration Measurement Automation System
Emmanuel A. Olowe
Danial Chitnis
80
0
0
07 Dec 2024
Advanced System Integration: Analyzing OpenAPI Chunking for Retrieval-Augmented Generation
Robin D. Pesl
Jerin G. Mathew
Massimo Mecella
Marco Aiello
78
1
0
29 Nov 2024
Action Engine: An LLM-based Framework for Automatic FaaS Workflow Generation
Akiharu Esashi
Pawissanutt Lertpongrujikorn
M. Salehi
79
0
0
29 Nov 2024
MAG-V: A Multi-Agent Framework for Synthetic Data Generation and Verification
Saptarshi Sengupta
Kristal Curtis
Akshay Mallipeddi
Abhinav Mathur
Joseph Ross
Liang Gou
LLMAG
SyDa
135
1
0
28 Nov 2024
CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning
Duo Wu
Yufei Guo
Yuan Meng
Yanning Zhang
Le Sun
Zhi Wang
234
0
0
25 Nov 2024
SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs
Shirley Kokane
Ming Zhu
Tulika Awalgaonkar
Jianguo Zhang
Thai Hoang
...
Juan Carlos Niebles
Huan Wang
Shelby Heinecke
Caiming Xiong
Silvio Savarese
LLMAG
109
1
0
20 Nov 2024
LibEvolutionEval: A Benchmark and Study for Version-Specific Code Generation
Sachit Kuhar
W. Ahmad
Zijian Wang
Nihal Jain
Haifeng Qian
Baishakhi Ray
M. K. Ramanathan
Xiaofei Ma
Anoop Deoras
ELM
71
1
0
19 Nov 2024
PTR: Precision-Driven Tool Recommendation for Large Language Models
Hang Gao
Yongfeng Zhang
KELM
46
0
0
14 Nov 2024
From Medprompt to o1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond
Harsha Nori
Naoto Usuyama
Nicholas King
S. McKinney
Xavier Fernandes
Sheng Zhang
Eric Horvitz
LRM
LM&MA
ELM
VLM
55
10
0
06 Nov 2024
EcoAct: Economic Agent Determines When to Register What Action
Shaokun Zhang
Jieyu Zhang
Dujian Ding
Mirian Hipolito Garcia
Ankur Mallick
Daniel Madrigal
Menglin Xia
Victor Rühle
Qingyun Wu
Chi Wang
LLMAG
53
4
0
03 Nov 2024
CmdCaliper: A Semantic-Aware Command-Line Embedding Model and Dataset for Security Research
Sian-Yao Huang
Cheng-Lin Yang
Hongpeng Zhou
Chun-Ying Huang
35
2
0
02 Nov 2024
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
Bohan Lyu
Yadi Cao
Duncan Watson-Parris
Leon Bergen
Taylor Berg-Kirkpatrick
Rose Yu
61
3
0
01 Nov 2024
FATH: Authentication-based Test-time Defense against Indirect Prompt Injection Attacks
Jiongxiao Wang
Fangzhou Wu
Wendi Li
Jinsheng Pan
Edward Suh
Zhuoqing Mao
Muhao Chen
Chaowei Xiao
AAML
40
6
0
28 Oct 2024
Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks
Graziano A. Manduzio
Federico A. Galatolo
M. G. Cimino
Enzo Pasquale Scilingo
Lorenzo Cominelli
LRM
29
1
0
24 Oct 2024
PRACT: Optimizing Principled Reasoning and Acting of LLM Agent
Zhiwei Liu
Weiran Yao
Jianguo Zhang
Rithesh Murthy
Liangwei Yang
...
Juan Carlos Niebles
Shelby Heinecke
Huan Wang
Silvio Savarese
Caiming Xiong
31
0
0
24 Oct 2024
Beyond Browsing: API-Based Web Agents
Yueqi Song
Frank F. Xu
Shuyan Zhou
Graham Neubig
61
16
0
21 Oct 2024