ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.01030
  4. Cited By
Executable Code Actions Elicit Better LLM Agents

Executable Code Actions Elicit Better LLM Agents

1 February 2024
Xingyao Wang
Yangyi Chen
Lifan Yuan
Yizhe Zhang
Yunzhu Li
Hao Peng
Heng Ji
    ELM
    LLMAG
    LM&Ro
ArXivPDFHTML

Papers citing "Executable Code Actions Elicit Better LLM Agents"

35 / 35 papers shown
Title
Evolution of AI in Education: Agentic Workflows
Evolution of AI in Education: Agentic Workflows
Firuz Kamalov
David Santandreu Calonge
Linda Smail
Dilshod Azizov
Dimple R. Thadani
Theresa Kwong
Amara Atif
50
1
0
25 Apr 2025
Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
Shaokun Zhang
Yi Dong
Jieyu Zhang
Jan Kautz
Bryan Catanzaro
Andrew Tao
Qingyun Wu
Zhiding Yu
Guilin Liu
LLMAG
OffRL
KELM
LRM
86
0
0
25 Apr 2025
Enhancing LLM-Based Agents via Global Planning and Hierarchical Execution
Enhancing LLM-Based Agents via Global Planning and Hierarchical Execution
Junjie Chen
Hao Li
Jingli Yang
Yong-Jin Liu
Qingyao Ai
LLMAG
82
0
0
23 Apr 2025
Breaking the Data Barrier -- Building GUI Agents Through Task Generalization
Breaking the Data Barrier -- Building GUI Agents Through Task Generalization
Junlei Zhang
Zichen Ding
Chang Ma
Zijie Chen
Qiushi Sun
Zhenzhong Lan
Junxian He
135
0
0
14 Apr 2025
ELT-Bench: An End-to-End Benchmark for Evaluating AI Agents on ELT Pipelines
ELT-Bench: An End-to-End Benchmark for Evaluating AI Agents on ELT Pipelines
Tengjun Jin
Yuxuan Zhu
Daniel Kang
LMTD
ELM
47
0
0
07 Apr 2025
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute
Yingwei Ma
Binhua Li
Yihong Dong
Xue Jiang
Rongyu Cao
J. Chen
Fei Huang
Yongqian Li
LLMAG
LRM
62
0
0
31 Mar 2025
SandboxEval: Towards Securing Test Environment for Untrusted Code
SandboxEval: Towards Securing Test Environment for Untrusted Code
Rafiqul Rabin
Jesse Hostetler
Sean McGregor
Brett Weir
Nick Judd
ELM
39
0
0
27 Mar 2025
AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents
AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents
Haoyu Wang
Christopher M. Poskitt
Jun Sun
37
0
0
24 Mar 2025
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval
Yu Zhang
Shutong Qiao
Jiaqi Zhang
Tzu-Heng Lin
Chen Gao
Yongqian Li
LM&Ro
LM&MA
90
1
0
07 Mar 2025
Personalize Your LLM: Fake it then Align it
Yijing Zhang
Dyah Adila
Changho Shin
Frederic Sala
88
0
0
02 Mar 2025
MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling
MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling
Yakun Zhu
Shaohang Wei
Xu Wang
Kui Xue
Xiaofan Zhang
S. Zhang
62
1
0
17 Feb 2025
Evaluating Agent-based Program Repair at Google
Evaluating Agent-based Program Repair at Google
Pat Rondon
Renyao Wei
J. Cambronero
Jürgen Cito
Aaron Sun
S. Sanyam
Michele Tufano
S. Chandra
44
3
0
13 Jan 2025
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Tianyu Zheng
Ge Zhang
Tianhao Shen
Xueling Liu
Bill Yuchen Lin
Jie Fu
Wenhu Chen
Xiang Yue
SyDa
91
102
0
08 Jan 2025
Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots
Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots
H. Zhang
Xiaoman Pan
Hongwei Wang
Kaixin Ma
W. Yu
Dong Yu
LLMAG
61
3
0
03 Jan 2025
LABIIUM: AI-Enhanced Zero-configuration Measurement Automation System
LABIIUM: AI-Enhanced Zero-configuration Measurement Automation System
Emmanuel A. Olowe
Danial Chitnis
72
0
0
07 Dec 2024
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
Bohan Lyu
Yadi Cao
Duncan Watson-Parris
Leon Bergen
Taylor Berg-Kirkpatrick
Rose Yu
61
3
0
01 Nov 2024
SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement
SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement
Antonis Antoniades
Albert Örwall
Kexun Zhang
Yuxi Xie
Anirudh Goyal
William Yang Wang
LLMAG
56
11
0
26 Oct 2024
Beyond Browsing: API-Based Web Agents
Beyond Browsing: API-Based Web Agents
Yueqi Song
Frank F. Xu
Shuyan Zhou
Graham Neubig
55
15
0
21 Oct 2024
Can Large Language Models Invent Algorithms to Improve Themselves?
Can Large Language Models Invent Algorithms to Improve Themselves?
Yoichi Ishibashi
Taro Yano
Masafumi Oyamada
AIFin
LRM
34
1
0
21 Oct 2024
AgentSquare: Automatic LLM Agent Search in Modular Design Space
AgentSquare: Automatic LLM Agent Search in Modular Design Space
Yu Shang
Yu Li
Keyu Zhao
Likai Ma
Jiaheng Liu
Fengli Xu
Yong Li
LLMAG
50
9
0
08 Oct 2024
MILE: A Mutation Testing Framework of In-Context Learning Systems
MILE: A Mutation Testing Framework of In-Context Learning Systems
Zeming Wei
Yihao Zhang
Meng Sun
45
0
0
07 Sep 2024
From Grounding to Planning: Benchmarking Bottlenecks in Web Agents
From Grounding to Planning: Benchmarking Bottlenecks in Web Agents
Segev Shlomov
Ben Wiesel
Aviad Sela
Ido Levy
Liane Galanti
Roy Abitbol
LLMAG
34
3
0
03 Sep 2024
DreamFactory: Pioneering Multi-Scene Long Video Generation with a
  Multi-Agent Framework
DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework
Zhifei Xie
Daniel Tang
Dingwei Tan
Jacques Klein
Tegawend F. Bissyand
Saad Ezzini
VGen
32
8
0
21 Aug 2024
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
Haolin Jin
Linghan Huang
Haipeng Cai
Jun Yan
Bo Li
Huaming Chen
78
24
0
05 Aug 2024
Towards Unified Alignment Between Agents, Humans, and Environment
Towards Unified Alignment Between Agents, Humans, and Environment
Zonghan Yang
An Liu
Zijun Liu
Kai Liu
Fangzhou Xiong
...
Zhenhe Zhang
Fuwen Luo
Zhicheng Guo
Peng Li
Yang Liu
32
4
0
12 Feb 2024
FireAct: Toward Language Agent Fine-tuning
FireAct: Toward Language Agent Fine-tuning
Baian Chen
Chang Shu
Ehsan Shareghi
Nigel Collier
Karthik Narasimhan
Shunyu Yao
ALM
LLMAG
99
97
0
09 Oct 2023
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language
  Feedback
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
Xingyao Wang
Zihan Wang
Jiateng Liu
Yangyi Chen
Lifan Yuan
Hao Peng
Heng Ji
LRM
130
141
0
19 Sep 2023
Measuring and Improving Chain-of-Thought Reasoning in Vision-Language
  Models
Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models
Yangyi Chen
Karan Sikka
Michael Cogswell
Heng Ji
Ajay Divakaran
LRM
36
24
0
08 Sep 2023
Generative Agents: Interactive Simulacra of Human Behavior
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
232
1,742
0
07 Apr 2023
ReAct: Synergizing Reasoning and Acting in Language Models
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
240
2,494
0
06 Oct 2022
ProgPrompt: Generating Situated Robot Task Plans using Large Language
  Models
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Ishika Singh
Valts Blukis
Arsalan Mousavian
Ankit Goyal
Danfei Xu
Jonathan Tremblay
D. Fox
Jesse Thomason
Animesh Garg
LM&Ro
LLMAG
120
624
0
22 Sep 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
314
3,248
0
21 Mar 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
367
8,495
0
28 Jan 2022
Measuring Coding Challenge Competence With APPS
Measuring Coding Challenge Competence With APPS
Dan Hendrycks
Steven Basart
Saurav Kadavath
Mantas Mazeika
Akul Arora
...
Collin Burns
Samir Puranik
Horace He
D. Song
Jacob Steinhardt
ELM
AIMat
ALM
208
624
0
20 May 2021
1