

IterPref: Focal Preference Learning for Code Generation via Iterative Debugging

4 March 2025
Jie Wu
Haoling Li
Xin Zhang
Jianwen Luo
Yangyu Huang
Ruihang Chu
Yue Yang
Scarlett Li
arXiv:2503.02783

Papers citing "IterPref: Focal Preference Learning for Code Generation via Iterative Debugging"

23 / 23 papers shown
Dynamic Scaling of Unit Tests for Code Reward Modeling
Zeyao Ma
Yanling Wang
Jing Zhang
Jifan Yu
Sijia Luo
Xiangbo Shu
ALM LRM
72
4
0
03 Jan 2025
DSTC: Direct Preference Learning with Only Self-Generated Tests and Code to Improve Code LMs
Zhihan Liu
Shenao Zhang
Yongfei Liu
Boyi Liu
Yingxiang Yang
Zhaoran Wang
175
5
0
20 Nov 2024
Aligning CodeLLMs with Direct Preference Optimization
Yibo Miao
Bofei Gao
Shanghaoran Quan
Junyang Lin
Daoguang Zan
Qingbin Liu
Jian Yang
Tianyu Liu
Zhijie Deng
104
6
0
24 Oct 2024
Process Supervision-Guided Policy Optimization for Code Generation
Ning Dai
Zheng Wu
Renjie Zheng
Ziyun Wei
Wenlei Shi
Xing Jin
Guanlin Liu
Chen Dun
Liang Huang
Lin Yan
107
11
0
23 Oct 2024
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
Xin Lai
Zhuotao Tian
Yukang Chen
Senqiao Yang
Xiangru Peng
Jiaya Jia
LRM
160
126
0
26 Jun 2024
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Terry Yue Zhuo
Minh Chien Vu
Jenny Chim
Han Hu
Wenhao Yu
...
David Lo
Daniel Fried
Xiaoning Du
H. D. Vries
Leandro von Werra
206
193
0
22 Jun 2024
Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency
Leonidas Gee
Milan Gritta
Gerasimos Lampouras
Ignacio Iacobacci
98
10
0
18 Jun 2024
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
Zhihan Liu
Miao Lu
Shenao Zhang
Boyi Liu
Hongyi Guo
Yingxiang Yang
Jose H. Blanchet
Zhaoran Wang
135
62
0
26 May 2024
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
Naman Jain
King Han
Alex Gu
Wen-Ding Li
Fanjia Yan
Tianjun Zhang
Sida I. Wang
Armando Solar-Lezama
Koushik Sen
Ion Stoica
ELM
132
449
0
12 Mar 2024
StarCoder 2 and The Stack v2: The Next Generation
Anton Lozhkov
Raymond Li
Loubna Ben Allal
Federico Cassano
J. Lamy-Poirier
...
Sean M. Hughes
Thomas Wolf
Arjun Guha
Leandro von Werra
H. D. Vries
OSLM ELM
81
357
0
29 Feb 2024
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive
Arka Pal
Deep Karkhanis
Samuel Dooley
Manley Roberts
Siddartha Naidu
Colin White
OSLM
91
155
0
20 Feb 2024
Noise Contrastive Alignment of Language Models with Explicit Rewards
Huayu Chen
Guande He
Lifan Yuan
Ganqu Cui
Hang Su
Jun Zhu
103
56
0
08 Feb 2024
KTO: Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh
Winnie Xu
Niklas Muennighoff
Dan Jurafsky
Douwe Kiela
283
569
0
02 Feb 2024
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Daya Guo
Qihao Zhu
Dejian Yang
Zhenda Xie
Kai Dong
...
Yu-Huan Wu
Yiming Li
Fuli Luo
Yingfei Xiong
W. Liang
ELM
125
798
0
25 Jan 2024
A Survey of Reinforcement Learning from Human Feedback
Timo Kaufmann
Paul Weng
Viktor Bengs
Eyke Hüllermeier
OffRL
93
155
0
22 Dec 2023
A Survey of Reasoning with Foundation Models
Jiankai Sun
Chuanyang Zheng
Enze Xie
Zhengying Liu
Ruihang Chu
...
Xipeng Qiu
Yi-Chen Guo
Hui Xiong
Qun Liu
Zhenguo Li
ReLM LRM AI4CE
176
84
0
17 Dec 2023
Magicoder: Empowering Code Generation with OSS-Instruct
Yuxiang Wei
Zhe Wang
Jiawei Liu
Yifeng Ding
Lingming Zhang
SyDa
88
118
0
04 Dec 2023
Qwen Technical Report
Jinze Bai
Shuai Bai
Yunfei Chu
Zeyu Cui
Kai Dang
...
Zhenru Zhang
Chang Zhou
Jingren Zhou
Xiaohuan Zhou
Tianhang Zhu
OSLM
268
1,908
0
28 Sep 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
389
4,169
0
29 May 2023
StarCoder: may the source be with you!
Raymond Li
Loubna Ben Allal
Yangtian Zi
Niklas Muennighoff
Denis Kocetkov
...
Sean M. Hughes
Thomas Wolf
Arjun Guha
Leandro von Werra
H. D. Vries
129
789
0
09 May 2023
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
Jiawei Liu
Chun Xia
Yuyao Wang
Lingming Zhang
ELM ALM
261
958
0
02 May 2023
Program Synthesis with Large Language Models
Jacob Austin
Augustus Odena
Maxwell Nye
Maarten Bosma
Henryk Michalewski
...
Ellen Jiang
Carrie J. Cai
Michael Terry
Quoc V. Le
Charles Sutton
ELM AIMat ReCod ALM
216
2,009
0
16 Aug 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM ALM
238
5,675
0
07 Jul 2021