ResearchTrend.AI
arXiv:2411.03817

From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning

6 November 2024
Zhirui Deng
Zhicheng Dou
Bo Li
Ruibin Xiong
Mang Wang
Xin Wu

Papers citing "From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning"

5 / 5 papers shown
Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
Pengxiang Li
Zhi Gao
Bofei Zhang
Yapeng Mi
Xiaojian Ma
...
Tao Yuan
Yuwei Wu
Yunde Jia
Song-Chun Zhu
Qing Li
LLMAG
72
0
0
30 Apr 2025
Exploring Expert Failures Improves LLM Agent Tuning
Li-Cheng Lan
Andrew Bai
Minhao Cheng
Ruochen Wang
Cho-Jui Hsieh
LRM
159
0
0
17 Apr 2025
A Survey on the Optimization of Large Language Model-based Agents
Shangheng Du
Jiabao Zhao
Jinxin Shi
Zhentao Xie
Xin Jiang
Yanhong Bai
Liang He
LLMAG
LM&Ro
LM&MA
220
1
0
16 Mar 2025
AgentRM: Enhancing Agent Generalization with Reward Modeling
Yu Xia
Jingru Fan
Weize Chen
Siyu Yan
Xin Cong
Zhong Zhang
Yaojie Lu
Yankai Lin
Zhiyuan Liu
Maosong Sun
56
1
0
25 Feb 2025
A Decade of Deep Learning: A Survey on The Magnificent Seven
Dilshod Azizov
Muhammad Arslan Manzoor
Velibor Bojkovic
Yingxu Wang
Zhilin Wang
...
Liang Li
Siwei Liu
Yu Zhong
Wei Liu
Shangsong Liang
OOD
AI4TS
MedIm
120
0
0
13 Dec 2024