AutoRefine: From Trajectories to Reusable Expertise for Continual LLM Agent Refinement

30 January 2026

Libin Qiu

Zhirong Gao

Junfu Chen

Yuhang Ye

Weizhi Huang

Xiaobo Xue

Wenkai Qiu

Shuo Tang


Main: 7 Pages

8 Figures

Bibliography: 3 Pages

10 Tables

Appendix: 30 Pages

Abstract

Large language model agents often fail to accumulate knowledge from experience, treating each task as an independent challenge. Recent methods extract experience as flattened textual knowledge, which cannot capture the procedural logic of complex subtasks. They also lack maintenance mechanisms, so the experience repository degrades as experience accumulates. We introduce AutoRefine, a framework that extracts and maintains dual-form Experience Patterns from agent execution histories. For procedural subtasks, it extracts specialized subagents with independent reasoning and memory. For static knowledge, it extracts skill patterns as guidelines or code snippets. A continuous maintenance mechanism scores, prunes, and merges patterns to prevent repository degradation. Evaluated on ALFWorld, ScienceWorld, and TravelPlanner, AutoRefine achieves 98.4%, 70.4%, and 27.1%, respectively, with 20-73% step reductions. On TravelPlanner, automatic extraction exceeds manually designed systems (27.1% vs. 12.1%), demonstrating its ability to capture procedural coordination.
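The score, prune, and merge cycle described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class names `Pattern` and `PatternRepository`, the smoothed success-rate score, the name-based duplicate test, and the thresholds `prune_below` and `min_uses` are all assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    name: str
    kind: str           # "subagent" (procedural) or "skill" (static knowledge)
    content: str        # guideline text, code snippet, or subagent prompt
    uses: int = 0
    successes: int = 0

    def score(self) -> float:
        # Assumed scoring rule: Laplace-smoothed success rate.
        return (self.successes + 1) / (self.uses + 2)


class PatternRepository:
    def __init__(self, prune_below: float = 0.3, min_uses: int = 3):
        self.prune_below = prune_below   # hypothetical threshold
        self.min_uses = min_uses         # hypothetical grace period
        self.patterns = {}

    @staticmethod
    def _key(kind: str, name: str) -> tuple:
        # Assumed duplicate test: same kind and same case-insensitive name.
        return (kind, name.strip().lower())

    def add(self, new: Pattern) -> Pattern:
        key = self._key(new.kind, new.name)
        old = self.patterns.get(key)
        if old is None:
            self.patterns[key] = new
            return new
        # Merge: pool usage statistics, keep the better-scoring content.
        best = old if old.score() >= new.score() else new
        merged = Pattern(best.name, best.kind, best.content,
                         old.uses + new.uses, old.successes + new.successes)
        self.patterns[key] = merged
        return merged

    def record(self, kind: str, name: str, success: bool) -> None:
        # Update statistics after a pattern was used in an episode.
        p = self.patterns[self._key(kind, name)]
        p.uses += 1
        p.successes += int(success)

    def prune(self) -> list:
        # Drop patterns that have been tried enough and still score poorly.
        doomed = [k for k, p in self.patterns.items()
                  if p.uses >= self.min_uses and p.score() < self.prune_below]
        return [self.patterns.pop(k).name for k in doomed]
```

In this sketch, a pattern that fails three times in a row scores 1/5 = 0.2 and is removed by `prune()`, while two entries that differ only in name casing are merged into one. A real system would likely use a semantic similarity test for merging rather than name matching.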
