Direct Preference Optimization: Your Language Model is Secretly a Reward Model
arXiv:2305.18290 · 29 May 2023
Rafael Rafailov, Archit Sharma, E. Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn · ALM
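Before the citation list, it is worth recalling the paper's core objective. The sketch below is a minimal, illustrative implementation of the DPO loss for a single preference pair, assuming the total log-probabilities of the chosen and rejected responses under the policy and the frozen reference model are already computed; the function name and scalar interface are assumptions for illustration, not the paper's code:

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair (chosen preferred over rejected).

    Inputs are total sequence log-probabilities under the trained policy
    and the frozen reference model; beta scales the implicit KL penalty.
    """
    # Implicit reward margin: beta * (chosen log-ratio - rejected log-ratio)
    margin = beta * ((policy_logp_chosen - ref_logp_chosen)
                     - (policy_logp_rejected - ref_logp_rejected))
    # Negative log-sigmoid of the margin: a logistic loss on the preference
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy equals the reference, the margin is zero and the loss is log 2 ≈ 0.693; raising the policy's relative likelihood of the chosen response drives the loss toward zero.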
Papers citing "Direct Preference Optimization: Your Language Model is Secretly a Reward Model" (showing 50 of 2,637):

Discriminator-Free Direct Preference Optimization for Video Diffusion
Haoran Cheng, Qide Dong, Liang Peng, Zhizhou Sha, Weiguo Feng, Jinghui Xie, Zhao Song, Shilei Wen, Xiaofei He, Boxi Wu · VGen · 11 Apr 2025
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
Xing Han Lù, Amirhossein Kazemnejad, Nicholas Meade, Arkil Patel, Dongchan Shin, Alejandra Zambrano, Karolina Stańczak, Peter Shaw, Christopher Pal, Siva Reddy · LLMAG · 11 Apr 2025
Supervised Optimism Correction: Be Confident When LLMs Are Sure
Jingyang Zhang, Rushuai Yang, Shunyu Liu, Ting-En Lin, Fei Huang, Yi Chen, Yong Li, Dacheng Tao · OffRL · 10 Apr 2025
Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining
Rosie Zhao, Alexandru Meterez, Sham Kakade, Cengiz Pehlevan, Samy Jelassi, Eran Malach · ReLM, LRM · 10 Apr 2025
Perception-R1: Pioneering Perception Policy with Reinforcement Learning
En Yu, Kangheng Lin, Liang Zhao, Jisheng Yin, Yana Wei, ..., Zheng Ge, Xiangyu Zhang, Daxin Jiang, Jingyu Wang, Wenbing Tao · VLM, OffRL, LRM · 10 Apr 2025
MM-IFEngine: Towards Multimodal Instruction Following
Shengyuan Ding, Shenxi Wu, Xiangyu Zhao, Yuhang Zang, Haodong Duan, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Dahua Lin, Jiaqi Wang · OffRL · 10 Apr 2025
Talking Point based Ideological Discourse Analysis in News Events
Nishanth Nakshatri, Nikhil Mehta, Siyi Liu, Sihao Chen, Daniel J. Hopkins, Dan Roth, Dan Goldwasser · 10 Apr 2025
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Haozhe Wang, Chao Qu, Zuming Huang, Wei Chu, Fangzhen Lin, Wenhu Chen · OffRL, ReLM, SyDa, LRM, VLM · 10 Apr 2025
2D-Curri-DPO: Two-Dimensional Curriculum Learning for Direct Preference Optimization
Mengyang Li, Zhong Zhang · 10 Apr 2025
FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion
Longguang Zhong, Fanqi Wan, Ziyi Yang, Guosheng Liang, Tianyuan Shi, Xiaojun Quan · MoMe · 09 Apr 2025
Bridging the Gap Between Preference Alignment and Machine Unlearning
Xiaohua Feng, Yuyuan Li, Huwei Ji, Jiaming Zhang, Lihe Zhang, Tianyu Du, Chaochao Chen · MU · 09 Apr 2025
Perception in Reflection
Yana Wei, Liang Zhao, Kangheng Lin, En Yu, Yuang Peng, ..., Jianjian Sun, Haoran Wei, Zheng Ge, Xiangyu Zhang, Vishal M. Patel · 09 Apr 2025
CAReDiO: Cultural Alignment of LLM via Representativeness and Distinctiveness Guided Data Optimization
Jing Yao, Xiaoyuan Yi, Jindong Wang, Zhicheng Dou, Xing Xie · 09 Apr 2025
Integrating Cognitive Processing Signals into Language Models: A Review of Advances, Applications and Future Directions
Angela Lopez-Cardona, Sebastian Idesis, Ioannis Arapakis · 09 Apr 2025
SemEval-2025 Task 5: LLMs4Subjects -- LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog
Jennifer D'Souza, Sameer Sadruddin, Holger Israel, Mathias Begoin, Diana Slawig · 09 Apr 2025
Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models
Ling Team, Caizhi Tang, Chilin Fu, Chunwei Wu, Jia Guo, ..., Shuaicheng Li, Wenjie Qu, Yingting Wu, Y. Liu, Zhenyu Huang · LRM · 09 Apr 2025
Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models
Wei Chen, Xin Yan, Bin Wen, Fan Yang, Tingting Gao, Di Zhang, Long Chen · MLLM · 09 Apr 2025
PaMi-VDPO: Mitigating Video Hallucinations by Prompt-Aware Multi-Instance Video Preference Learning
Xinpeng Ding, Kaipeng Zhang, Jianhua Han, Lanqing Hong, Hang Xu, Xuelong Li · MLLM, VLM · 08 Apr 2025
Sharpness-Aware Parameter Selection for Machine Unlearning
Saber Malekmohammadi, Hong kyu Lee, Li Xiong · MU · 08 Apr 2025
Leanabell-Prover: Posttraining Scaling in Formal Reasoning
Jingyuan Zhang, Qi Wang, Xingguang Ji, Yong Liu, Yang Yue, Fuzheng Zhang, Di Zhang, Guorui Zhou, Kun Gai · LRM · 08 Apr 2025
Information-Theoretic Reward Decomposition for Generalizable RLHF
Liyuan Mao, Haoran Xu, Amy Zhang, Weinan Zhang, Chenjia Bai · 08 Apr 2025
Stratified Expert Cloning with Adaptive Selection for User Retention in Large-Scale Recommender Systems
Chengzhi Lin, Annan Xie, Shuchang Liu, Wuhong Wang, Chuyuan Wang, Yongqi Liu · OffRL · 08 Apr 2025
Understanding Machine Unlearning Through the Lens of Mode Connectivity
Jiali Cheng, Hadi Amiri · MU · 08 Apr 2025
Can you Finetune your Binoculars? Embedding Text Watermarks into the Weights of Large Language Models
Fay Elhassan, Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping · 08 Apr 2025
R2Vul: Learning to Reason about Software Vulnerabilities with Reinforcement Learning and Structured Reasoning Distillation
Martin Weyssow, Chengran Yang, Junkai Chen, Yikun Li, Huihui Huang, ..., Han Wei Ang, Frank Liauw, Eng Lieh Ouh, Lwin Khin Shar, David Lo · LRM · 07 Apr 2025
Do PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning vs. Memorization in Large Language Models
Yang Yan, Yu Lu, Renjun Xu, Zhenzhong Lan · LRM · 07 Apr 2025
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
Taiwei Shi, Yiyang Wu, Linxin Song, Dinesh Manocha, Jieyu Zhao · LRM · 07 Apr 2025
Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling
Benjamin Lipkin, Benjamin LeBrun, Jacob Hoover Vigly, João Loula, David R. MacIver, ..., Ryan Cotterell, Vikash K. Mansinghka, Timothy J. O'Donnell, Alexander K. Lew, Tim Vieira · 07 Apr 2025
Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors
Fan Nie, Lan Feng, Haotian Ye, Weixin Liang, Pan Lu, Huaxiu Yao, Alexandre Alahi, James Zou · 07 Apr 2025
OCC-MLLM-CoT-Alpha: Towards Multi-stage Occlusion Recognition Based on Large Language Models via 3D-Aware Supervision and Chain-of-Thoughts Guidance
Chaoyi Wang, Baoqing Li, Xinhan Di · MLLM, LRM · 07 Apr 2025
Truthful or Fabricated? Using Causal Attribution to Mitigate Reward Hacking in Explanations
Pedro Ferreira, Wilker Aziz, Ivan Titov · LRM · 07 Apr 2025
CARE: Aligning Language Models for Regional Cultural Awareness
Geyang Guo, Tarek Naous, Hiromi Wakaki, Yukiko Nishimura, Yuki Mitsufuji, Alan Ritter, Wei Xu · 07 Apr 2025
Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use
Anna Goldie, Azalia Mirhoseini, Hao Zhou, Irene Cai, Christopher D. Manning · SyDa, OffRL, ReLM, LRM · 07 Apr 2025
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images
Cheng Chen, Jiacheng Wei, Tianrun Chen, Chi Zhang, Xiaofeng Yang, ..., Bingchen Yang, Chuan-Sheng Foo, Guosheng Lin, Qixing Huang, Fayao Liu · 07 Apr 2025
Constitution or Collapse? Exploring Constitutional AI with Llama 3-8B
Xue Zhang · MoMe, ALM, SyDa, ELM · 07 Apr 2025
The Curse of CoT: On the Limitations of Chain-of-Thought in In-Context Learning
T. Zheng, Yixiang Chen, Chengxi Li, Chunyang Li, Qing Zong, Haochen Shi, Baixuan Xu, Yangqiu Song, Ginny Wong, Simon See · LRM · 07 Apr 2025
Revealing the Intrinsic Ethical Vulnerability of Aligned Large Language Models
Jiawei Lian, Jianhong Pan, L. Wang, Yi Wang, Shaohui Mei, Lap-Pui Chau · AAML · 07 Apr 2025
Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning
Anja Surina, Amin Mansouri, Lars Quaedvlieg, Amal Seddas, Maryna Viazovska, Emmanuel Abbe, Çağlar Gülçehre · 07 Apr 2025
A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization
Wenyuan Xu, Xiaochen Zuo, Chao Xin, Yu Yue, Lin Yan, Yonghui Wu · OffRL · 07 Apr 2025
A Llama walks into the 'Bar': Efficient Supervised Fine-Tuning for Legal Reasoning in the Multi-state Bar Exam
Rean Fernandes, André Biedenkapp, Frank Hutter, Noor H. Awad · ALM, ELM, LRM · 07 Apr 2025
Enhancing Compositional Reasoning in Vision-Language Models with Synthetic Preference Data
Samarth Mishra, Kate Saenko, Venkatesh Saligrama · CoGe, LRM · 07 Apr 2025
Not All Data Are Unlearned Equally
Aravind Krishnan, Siva Reddy, Marius Mosbach · MU · 07 Apr 2025
LLM-based Automated Grading with Human-in-the-Loop
Hang Li, Yucheng Chu, Kaiqi Yang, Yasemin Copur-Gencturk, Jiliang Tang · AI4Ed, ELM · 07 Apr 2025
A Domain-Based Taxonomy of Jailbreak Vulnerabilities in Large Language Models
Carlos Peláez-González, Andrés Herrera-Poyatos, Cristina Zuheros, David Herrera-Poyatos, Virilo Tejedor, F. Herrera · AAML · 07 Apr 2025
Lightweight and Direct Document Relevance Optimization for Generative Information Retrieval
Kidist Amde Mekonnen, Yubao Tang, Maarten de Rijke · 07 Apr 2025
Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning
Xuerui Su, Shufang Xie, Guoqing Liu, Yingce Xia, Renqian Luo, Peiran Jin, Zhiming Ma, Yue Wang, Zun Wang, Yuting Liu · LRM · 06 Apr 2025
ADAPT: Actively Discovering and Adapting to Preferences for any Task
Maithili Patel, Xavier Puig, Ruta Desai, Roozbeh Mottaghi, Sonia Chernova, Joanne Truong, Akshara Rai · 05 Apr 2025
MSL: Not All Tokens Are What You Need for Tuning LLM as a Recommender
Bohao Wang, Feng Liu, Jiawei Chen, Xingyu Lou, Changwang Zhang, Jun Wang, Yuegang Sun, Yan Feng, Chong Chen, C. Wang · 05 Apr 2025
Towards Understanding and Improving Refusal in Compressed Models via Mechanistic Interpretability
Vishnu Kabir Chhabra, Mohammad Mahdi Khalili · AI4CE · 05 Apr 2025
MultiClear: Multimodal Soft Exoskeleton Glove for Transparent Object Grasping Assistance
Chen Hu, Timothy Neate, Shan Luo, Letizia Gionfrida · 04 Apr 2025