Directly training Large Language Models (LLMs) for Multi-Agent Systems (MAS) remains challenging due to intricate reward modeling, dynamic agent interactions, and demanding generalization requirements. This paper explores whether post-training techniques, specifically Supervised Fine-Tuning (SFT) and Reinforcement Learning with Verifiable Rewards (RLVR), can effectively generalize to multi-agent scenarios. We use economic reasoning as a testbed, leveraging its strong foundations in mathematics and game theory, its demand for structured analytical reasoning, and its relevance to real-world applications such as market design, resource allocation, and policy analysis. We introduce Recon (Reasoning like an economist), a 7B-parameter open-source LLM post-trained on a hand-curated dataset of 2,100 high-quality economic reasoning problems. Comprehensive evaluation on economic reasoning benchmarks and multi-agent games reveals clear improvements in structured reasoning and economic rationality. These results underscore the promise of domain-aligned post-training for enhancing reasoning and agent alignment, shedding light on the roles of SFT and RL in shaping model behavior. Code is available at this https URL.
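The abstract highlights Reinforcement Learning with Verifiable Rewards (RLVR), in which the training signal comes from an automatically checkable answer rather than a learned reward model. As a minimal illustrative sketch (not the paper's implementation; the \boxed{...} answer convention and the function name verifiable_reward are assumptions), a verifiable reward for an economic reasoning problem might look like:

import re

def verifiable_reward(completion: str, gold_answer: str) -> float:
    """Binary reward: 1.0 if the model's final boxed answer matches the
    ground truth, else 0.0. Assumes answers are wrapped in \\boxed{...},
    a common convention in math-style RLVR pipelines; the paper's exact
    answer format may differ."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0  # no parseable final answer, so no reward
    predicted = match.group(1).strip()
    return 1.0 if predicted == gold_answer.strip() else 0.0

# Example: a game-theory problem whose unique Nash equilibrium is (D, D)
completion = "Both players defect in equilibrium, so the answer is \\boxed{(D, D)}."
print(verifiable_reward(completion, "(D, D)"))  # prints 1.0

Because the reward is computed by checking against a ground-truth solution (e.g., a game's equilibrium strategy profile), this style of signal sidesteps the intricate reward modeling the abstract identifies as a core difficulty of training LLMs for multi-agent settings.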
@article{zhou2025_2506.00577,
  title={Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs},
  author={Yufa Zhou and Shaobo Wang and Xingyu Dong and Xiangqi Jin and Yifang Chen and Yue Min and Kexin Yang and Xingzhang Ren and Dayiheng Liu and Linfeng Zhang},
  journal={arXiv preprint arXiv:2506.00577},
  year={2025}
}