Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2505.18736
Cited By
Rethinking Direct Preference Optimization in Diffusion Models
24 May 2025
Junyong Kang
Seohyun Lim
Kyungjune Baek
Hyunjung Shim
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Rethinking Direct Preference Optimization in Diffusion Models"
33 / 33 papers shown
Title
Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards
Zijing Hu
Fengda Zhang
Long Chen
Kun Kuang
Jiahui Li
Kaifeng Gao
Jun Xiao
X. Wang
Wenwu Zhu
EGVM
126
4
0
14 Mar 2025
RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
Hanyang Zhao
Genta Indra Winata
Anirban Das
Shi-Xiong Zhang
D. Yao
Wenpin Tang
Sambit Sahu
76
8
0
05 Oct 2024
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
Jiwoo Hong
Sayak Paul
Noah Lee
Kashif Rasul
James Thorne
Jongheon Jeong
75
15
0
10 Jun 2024
SimPO: Simple Preference Optimization with a Reference-Free Reward
Yu Meng
Mengzhou Xia
Danqi Chen
88
425
0
23 May 2024
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski
Boris Shaposhnikov
Alexey Malakhov
Nikita Surnachev
Yaroslav Aksenov
Ian Maksimov
Nikita Balagansky
Daniil Gavrilov
OffRL
95
30
0
15 Apr 2024
Aligning Diffusion Models by Optimizing Human Utility
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Yusuke Kato
Kazuki Kozuka
125
31
0
06 Apr 2024
ORPO: Monolithic Preference Optimization without Reference Model
Jiwoo Hong
Noah Lee
James Thorne
OSLM
66
232
0
12 Mar 2024
A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
Shentao Yang
Tianqi Chen
Mingyuan Zhou
EGVM
66
27
0
13 Feb 2024
KTO: Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh
Winnie Xu
Niklas Muennighoff
Dan Jurafsky
Douwe Kiela
201
510
0
02 Feb 2024
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
Kai Yang
Jian Tao
Jiafei Lyu
Chunjiang Ge
Jiaxin Chen
Qimai Li
Weihan Shen
Xiaolong Zhu
Xiu Li
EGVM
42
102
0
22 Nov 2023
Diffusion Model Alignment Using Direct Preference Optimization
Bram Wallace
Meihua Dang
Rafael Rafailov
Linqi Zhou
Aaron Lou
Senthil Purushwalkam
Stefano Ermon
Caiming Xiong
Shafiq Joty
Nikhil Naik
EGVM
79
251
0
21 Nov 2023
A General Theoretical Paradigm to Understand Learning from Human Preferences
M. G. Azar
Mark Rowland
Bilal Piot
Daniel Guo
Daniele Calandriello
Michal Valko
Rémi Munos
112
597
0
18 Oct 2023
Aligning Text-to-Image Diffusion Models with Reward Backpropagation
Mihir Prabhudesai
Anirudh Goyal
Deepak Pathak
Katerina Fragkiadaki
67
118
0
05 Oct 2023
Directly Fine-Tuning Diffusion Models on Differentiable Rewards
Amita Gajewar
Paul Vicol
G. Bansal
David J Fleet
44
167
0
29 Sep 2023
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
Xiaoshi Wu
Yiming Hao
Keqiang Sun
Yixiong Chen
Feng Zhu
Rui Zhao
Hongsheng Li
73
274
0
15 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
293
3,712
0
29 May 2023
DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models
Ying Fan
Olivia Watkins
Yuqing Du
Hao Liu
Moonkyung Ryu
Craig Boutilier
Pieter Abbeel
Mohammad Ghavamzadeh
Kangwook Lee
Kimin Lee
74
154
0
25 May 2023
Training Diffusion Models with Reinforcement Learning
Kevin Black
Michael Janner
Yilun Du
Ilya Kostrikov
Sergey Levine
EGVM
74
341
0
22 May 2023
Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation
Yuval Kirstain
Adam Polyak
Uriel Singer
Shahbuland Matiana
Joe Penna
Omer Levy
EGVM
185
375
0
02 May 2023
MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
Ming Cao
Xintao Wang
Zhongang Qi
Ying Shan
Xiaohu Qie
Yinqiang Zheng
DiffM
60
448
0
17 Apr 2023
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
Jiazheng Xu
Xiao Liu
Yuchen Wu
Yuxuan Tong
Qinkai Li
Ming Ding
Jie Tang
Yuxiao Dong
87
360
0
12 Apr 2023
Efficient Diffusion Training via Min-SNR Weighting Strategy
Tiankai Hang
Shuyang Gu
Chen Li
Jianmin Bao
Dong Chen
Han Hu
Xin Geng
B. Guo
43
154
0
16 Mar 2023
Diffusion Models Generate Images Like Painters: an Analytical Theory of Outline First, Details Later
Binxu Wang
John J. Vastola
DiffM
106
24
0
04 Mar 2023
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLM
MoE
140
817
0
02 Nov 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
162
1,089
0
22 Jun 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
298
5,904
0
23 May 2022
Perception Prioritized Training of Diffusion Models
Jooyoung Choi
Jungbeom Lee
Chaehun Shin
Sungwon Kim
Hyunwoo J. Kim
Sung-Hoon Yoon
DiffM
56
239
0
01 Apr 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
710
12,525
0
04 Mar 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
270
15,081
0
20 Dec 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
696
28,659
0
26 Feb 2021
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM
SyDa
270
6,293
0
26 Nov 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
334
17,550
0
19 Jun 2020
Generative Modeling by Estimating Gradients of the Data Distribution
Yang Song
Stefano Ermon
SyDa
DiffM
186
3,803
0
12 Jul 2019
1