ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization

6 February 2025

Abstract

Recent research has leveraged large language model multi-agent systems for complex problem-solving while trying to reduce the manual effort required to build them, driving the development of automated agent workflow optimization methods. However, existing methods remain inflexible due to representational limitations, a lack of adaptability, and poor scalability when relying on discrete optimization techniques. We address these challenges with ScoreFlow, a simple yet high-performance framework that leverages efficient gradient-based optimization in a continuous space. ScoreFlow incorporates Score-DPO, a novel variant of the direct preference optimization method that accounts for quantitative feedback. Across six benchmarks spanning question answering, coding, and mathematical reasoning, ScoreFlow achieves an 8.2% improvement over existing baselines. Moreover, it empowers smaller models to outperform larger ones with lower inference costs.
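
For context, the standard direct preference optimization (DPO) objective, which the abstract names as the starting point for Score-DPO, can be written as below. This is the widely used baseline form only, not the Score-DPO objective: the abstract does not say how quantitative feedback enters the loss, so that part is left out. All symbols are assumptions taken from the usual DPO formulation rather than from this page: $\pi_\theta$ is the policy being trained, $\pi_{\mathrm{ref}}$ a frozen reference policy, $(x, y_w, y_l)$ a prompt with its preferred and dispreferred responses, $\beta$ a temperature, and $\sigma$ the logistic function.

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
    \left[
      \log \sigma\!\left(
        \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
        - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
      \right)
    \right]
```

In this baseline the training signal is purely ordinal: only which response is preferred matters, not by how much. A variant that "accounts for quantitative feedback" would presumably also use the magnitude of the evaluation scores.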


@article{wang2025_2502.04306,
  title={ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization},
  author={Yinjie Wang and Ling Yang and Guohao Li and Mengdi Wang and Bryon Aragam},
  journal={arXiv preprint arXiv:2502.04306},
  year={2025}
}
