2406.11176
Cited By
Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement
17 June 2024
Weimin Xiong
Yifan Song
Xiutian Zhao
Wenhao Wu
Xun Wang
Ke Wang
Cheng Li
Wei Peng
Sujian Li
Papers citing
"Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement"
22 / 22 papers shown
Title
Semantic Probabilistic Control of Language Models
Kareem Ahmed
Catarina G Belém
Padhraic Smyth
Sameer Singh
42
0
0
04 May 2025
Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
Pengxiang Li
Zhi Gao
Bofei Zhang
Yapeng Mi
Xiaojian Ma
...
Tao Yuan
Yuwei Wu
Yunde Jia
Song-Chun Zhu
Qing Li
LLMAG
70
0
0
30 Apr 2025
MrGuard: A Multilingual Reasoning Guardrail for Universal LLM Safety
Yahan Yang
Soham Dan
Shuo Li
Dan Roth
Insup Lee
LRM
36
0
0
21 Apr 2025
Exploring Expert Failures Improves LLM Agent Tuning
Li-Cheng Lan
Andrew Bai
Minhao Cheng
Ruochen Wang
Cho-Jui Hsieh
LRM
159
0
0
17 Apr 2025
A Desideratum for Conversational Agents: Capabilities, Challenges, and Future Directions
Emre Can Acikgoz
Cheng Qian
Hongru Wang
Vardhan Dongre
Xiusi Chen
Heng Ji
Dilek Hakkani-Tur
Gokhan Tur
LM&Ro
ELM
55
1
0
07 Apr 2025
WorkTeam: Constructing Workflows from Natural Language with Multi-Agents
Hanchao Liu
Rongjun Li
Weimin Xiong
Ziyu Zhou
Wei Peng
LLMAG
77
0
0
28 Mar 2025
DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning
Fucai Ke
Vijay Kumar B G
Xingjian Leng
Zhixi Cai
Zaid Khan
Weiqing Wang
P. D. Haghighi
H. Rezatofighi
Manmohan Chandraker
44
0
0
25 Mar 2025
A Survey on the Optimization of Large Language Model-based Agents
Shangheng Du
Jiabao Zhao
Jinxin Shi
Zhentao Xie
Xin Jiang
Yanhong Bai
Liang He
LLMAG
LM&Ro
LM&MA
220
1
0
16 Mar 2025
MPO: Boosting LLM Agents with Meta Plan Optimization
Weimin Xiong
Yifan Song
Qingxiu Dong
Bingchan Zhao
Feifan Song
Xun Wang
Sujian Li
LLMAG
81
0
0
04 Mar 2025
ATLaS: Agent Tuning via Learning Critical Steps
Zhixun Chen
Ming Li
Y. Huang
Yali Du
Meng Fang
Dinesh Manocha
83
3
0
04 Mar 2025
AgentRM: Enhancing Agent Generalization with Reward Modeling
Yu Xia
Jingru Fan
Weize Chen
Siyu Yan
Xin Cong
Zhong Zhang
Yaojie Lu
Yankai Lin
Zhiyuan Liu
Maosong Sun
56
1
0
25 Feb 2025
Scaling Autonomous Agents via Automatic Reward Modeling And Planning
Zhenfang Chen
Delin Chen
Rui Sun
Wenjun Liu
Chuang Gan
LLMAG
60
3
0
17 Feb 2025
Outcome-Refining Process Supervision for Code Generation
Zhuohao Yu
Weizheng Gu
Yidong Wang
Zhengran Zeng
Jindong Wang
Wei Ye
Shikun Zhang
LRM
89
4
0
19 Dec 2024
Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System
Weize Chen
Jiarui Yuan
Chen Qian
Cheng Yang
Zhiyuan Liu
Maosong Sun
LLMAG
28
4
0
10 Oct 2024
AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories
Yifan Song
Weimin Xiong
Xiutian Zhao
Dawei Zhu
Wenhao Wu
Ke Wang
Cheng Li
Wei Peng
Sujian Li
LLMAG
31
9
0
10 Oct 2024
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Bofei Gao
Feifan Song
Yibo Miao
Zefan Cai
Z. Yang
...
Houfeng Wang
Zhifang Sui
Peiyi Wang
Baobao Chang
50
11
0
04 Sep 2024
Hybrid Student-Teacher Large Language Model Refinement for Cancer Toxicity Symptom Extraction
Reza Khanmohammadi
A. Ghanem
Kyle Verdecchia
Ryan Hall
Mohamed Elshaikh
...
Bing Luo
I. Chetty
Tuka Alhanai
Kundan Thind
Mohammad M. Ghassemi
47
0
0
08 Aug 2024
A Survey on Large Language Model-Based Game Agents
Sihao Hu
Tiansheng Huang
Gaowen Liu
Ramana Rao Kompella
Selim Furkan Tekin
Yichang Xu
Zachary Yahn
Ling Liu
LLMAG
LM&Ro
AI4CE
LM&MA
71
51
0
02 Apr 2024
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Avi Singh
John D. Co-Reyes
Rishabh Agarwal
Ankesh Anand
Piyush Patil
...
Yamini Bansal
Ethan Dyer
Behnam Neyshabur
Jascha Narain Sohl-Dickstein
Noah Fiedel
ALM
LRM
ReLM
SyDa
157
144
0
11 Dec 2023
FireAct: Toward Language Agent Fine-tuning
Baian Chen
Chang Shu
Ehsan Shareghi
Nigel Collier
Karthik Narasimhan
Shunyu Yao
ALM
LLMAG
99
97
0
09 Oct 2023
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
249
2,494
0
06 Oct 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
319
11,953
0
04 Mar 2022