IterPref: Focal Preference Learning for Code Generation via Iterative Debugging

4 March 2025
Jie Wu, Haoling Li, Xin Zhang, Jianwen Luo, Yangyu Huang, Ruihang Chu, Yujiu Yang, Scarlett Li
Abstract

Preference learning enhances Code LLMs beyond supervised fine-tuning by leveraging relative quality comparisons. Existing methods construct preference pairs from candidates based on test-case success, treating the sample with the higher pass rate as positive and the one with the lower pass rate as negative. However, this approach does not pinpoint specific errors in the code: aligning failing code as a whole lacks the granularity needed to capture meaningful error-resolution relationships, so the model cannot learn informative error-correction patterns. To address these issues, we propose IterPref, a new preference alignment framework that mimics human iterative debugging to refine Code LLMs. IterPref explicitly locates error regions and aligns the corresponding tokens via a tailored DPO algorithm. To generate informative pairs, we introduce the CodeFlow dataset, in which samples are iteratively refined until they pass the tests, with the modifications capturing the error corrections. Extensive experiments show that a diverse suite of Code LLMs equipped with IterPref achieves significant gains in code generation and improves on challenging tasks such as BigCodeBench. In-depth analysis reveals that IterPref yields fewer errors. Our code and data will be made publicly available.
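The abstract suggests a two-step pipeline: build preference pairs from an iterative-debugging trajectory, then align only the error-region tokens. Below is a minimal sketch of the first step, assuming line-level diffs as a stand-in for the paper's error regions; the helper names (make_pair, run_tests, error_region_lines) are illustrative assumptions, not the released CodeFlow tooling.

```python
# Hedged sketch: build an IterPref-style preference pair from an
# iterative-debugging trajectory. A failing attempt is revised until it
# passes the tests; the edited region of the last failing version is a
# proxy for the "error region" to be penalized. Names and the line-level
# granularity are assumptions for illustration.
import difflib

def error_region_lines(failing_code: str, fixed_code: str):
    """Indices of lines in `failing_code` that were changed or removed
    on the way to `fixed_code` (a line-level proxy for error tokens)."""
    sm = difflib.SequenceMatcher(a=failing_code.splitlines(),
                                 b=fixed_code.splitlines())
    bad = []
    for tag, i1, i2, _, _ in sm.get_opcodes():
        if tag in ("replace", "delete"):
            bad.extend(range(i1, i2))
    return bad

def make_pair(prompt, attempts, run_tests):
    """Walk the debugging trajectory; once an attempt passes, pair it
    (chosen) with the last failing attempt (rejected) plus the
    diff-derived error region."""
    last_fail = None
    for code in attempts:
        if run_tests(code):
            if last_fail is None:
                return None  # passed on the first try: no informative pair
            return {"prompt": prompt,
                    "chosen": code,
                    "rejected": last_fail,
                    "error_lines": error_region_lines(last_fail, code)}
        last_fail = code
    return None  # never passed: no positive sample
```

The "tailored DPO" can then be read as a token-masked variant of the standard DPO objective: the rejected sample contributes only its error-region tokens, while the chosen sample contributes all of its completion tokens. The sketch below is written under that assumption and is not the authors' implementation; focal_dpo_loss, the mask arguments, and beta are placeholders.

```python
# Hedged sketch of a token-masked ("focal") DPO loss in PyTorch.
import torch
import torch.nn.functional as F

def sequence_logprob(logits, labels, mask):
    """Sum of log-probs of `labels` under `logits`, restricted to `mask`.
    logits: (B, T, V); labels: (B, T) token ids; mask: (B, T) float."""
    logp = torch.log_softmax(logits, dim=-1)
    token_logp = logp.gather(-1, labels.unsqueeze(-1)).squeeze(-1)  # (B, T)
    return (token_logp * mask).sum(dim=-1)

def focal_dpo_loss(policy_chosen_logits, policy_rejected_logits,
                   ref_chosen_logits, ref_rejected_logits,
                   chosen_ids, rejected_ids,
                   chosen_mask, rejected_error_mask, beta=0.1):
    """DPO loss where the rejected sample is scored only on its
    error-region tokens; the chosen sample uses all completion tokens."""
    pi_c  = sequence_logprob(policy_chosen_logits,   chosen_ids,   chosen_mask)
    pi_r  = sequence_logprob(policy_rejected_logits, rejected_ids, rejected_error_mask)
    ref_c = sequence_logprob(ref_chosen_logits,      chosen_ids,   chosen_mask)
    ref_r = sequence_logprob(ref_rejected_logits,    rejected_ids, rejected_error_mask)
    # Standard DPO margin, computed on the masked log-probabilities.
    margins = beta * ((pi_c - ref_c) - (pi_r - ref_r))
    return -F.logsigmoid(margins).mean()
```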

@article{wu2025_2503.02783,
  title={IterPref: Focal Preference Learning for Code Generation via Iterative Debugging},
  author={Jie Wu and Haoling Li and Xin Zhang and Jianwen Luo and Yangyu Huang and Ruihang Chu and Yujiu Yang and Scarlett Li},
  journal={arXiv preprint arXiv:2503.02783},
  year={2025}
}