Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2305.13301
Cited By
v1
v2
v3
v4 (latest)
Training Diffusion Models with Reinforcement Learning
22 May 2023
Kevin Black
Michael Janner
Yilun Du
Ilya Kostrikov
Sergey Levine
EGVM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Training Diffusion Models with Reinforcement Learning"
50 / 78 papers shown
Title
LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models
Fengqi Zhu
Rongzhen Wang
Shen Nie
Xiaolu Zhang
Chunwei Wu
...
Jun Zhou
Jianfei Chen
Yankai Lin
Ji-Rong Wen
Chongxuan Li
171
0
0
25 May 2025
Rethinking Direct Preference Optimization in Diffusion Models
Junyong Kang
Seohyun Lim
Kyungjune Baek
Hyunjung Shim
762
0
0
24 May 2025
InfLVG: Reinforce Inference-Time Consistent Long Video Generation with GRPO
Xueji Fang
Liyuan Ma
Zhiyang Chen
Mingyuan Zhou
Guo-Jun Qi
VGen
224
0
0
23 May 2025
Scaling Image and Video Generation via Test-Time Evolutionary Search
Haoran He
Jiajun Liang
X. Wang
Pengfei Wan
Di Zhang
Kun Gai
Ling Pan
DiffM
221
0
0
23 May 2025
Minimum-Excess-Work Guidance
Christopher Kolloff
Tobias Höppe
Emmanouil Angelis
Mathias Jacob Schreiner
Stefan Bauer
Andrea Dittadi
Simon Olsson
OT
91
0
0
19 May 2025
You Only Look One Step: Accelerating Backpropagation in Diffusion Sampling with Gradient Shortcuts
Hongkun Dou
Zeyu Li
Xingyu Jiang
Haoyang Li
Lijun Yang
Wen Yao
Yue Deng
DiffM
224
0
0
12 May 2025
Flow-GRPO: Training Flow Matching Models via Online RL
Jie Liu
Gongye Liu
Jiajun Liang
Yongqian Li
Jiaheng Liu
Xinyu Wang
Pengfei Wan
Di Zhang
Wanli Ouyang
AI4CE
166
4
0
08 May 2025
F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization
Xiaohui Sun
Ruitong Xiao
Jianye Mo
Bowen Wu
Qun Yu
Baoxun Wang
87
2
0
03 Apr 2025
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
Yunhong Min
Daehyeon Choi
Kyeongmin Yeo
Jihyun Lee
Minhyuk Sung
94
0
0
28 Mar 2025
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
Jaihoon Kim
Taehoon Yoon
Jisung Hwang
Minhyuk Sung
DiffM
121
3
0
25 Mar 2025
EvolvingGrasp: Evolutionary Grasp Generation via Efficient Preference Alignment
Yufei Zhu
Yiming Zhong
Zemin Yang
Peishan Cong
Jingyi Yu
X. Zhu
Y. Ma
88
1
0
18 Mar 2025
Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards
Zijing Hu
Fengda Zhang
Long Chen
Kun Kuang
Jiahui Li
Kaifeng Gao
Jun Xiao
X. Wang
Wenwu Zhu
EGVM
207
4
0
14 Mar 2025
Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization
Kyle Sargent
Kyle Hsu
Justin Johnson
L. Fei-Fei
Jiajun Wu
DiffM
MU
138
7
0
14 Mar 2025
Controllable Latent Diffusion for Traffic Simulation
Yizhuo Xiao
Mustafa Suphi Erden
Cheng Wang
79
0
0
14 Mar 2025
RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling
Itay Chachy
Guy Yariv
Sagie Benaim
439
0
0
12 Mar 2025
Aligning Text to Image in Diffusion Models is Easier Than You Think
J. Lee
Byunghee Cha
Jeongsol Kim
Jong Chul Ye
90
1
0
11 Mar 2025
Preference-Based Alignment of Discrete Diffusion Models
Umberto Borso
Davide Paglieri
Jude Wells
Tim Rocktaschel
104
3
0
11 Mar 2025
Boosting Diffusion-Based Text Image Super-Resolution Model Towards Generalized Real-World Scenarios
Chenglu Pan
Xiaogang Xu
Ganggui Ding
Yunke Zhang
Wenbo Li
Jiarong Xu
Qingbiao Wu
101
0
0
10 Mar 2025
A Simple and Effective Reinforcement Learning Method for Text-to-Image Diffusion Fine-tuning
Shashank Gupta
Chaitanya Ahuja
Tsung-Yu Lin
Sreya Dutta Roy
Harrie Oosterhuis
Maarten de Rijke
Satya Narayan Shukla
95
1
0
02 Mar 2025
Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport
Mingyang Sun
Pengxiang Ding
Weinan Zhang
Donglin Wang
OT
127
0
0
24 Feb 2025
CHATS: Combining Human-Aligned Optimization and Test-Time Sampling for Text-to-Image Generation
Minghao Fu
Guo-Hua Wang
Liangfu Cao
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
54
0
0
18 Feb 2025
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
Michael Psenka
Alejandro Escontrela
Pieter Abbeel
Yi-An Ma
DiffM
140
31
0
17 Feb 2025
DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control
Junjie Wen
Yinlin Zhu
Jinming Li
Zhibin Tang
Yaxin Peng
Feifei Feng
VLM
108
22
0
09 Feb 2025
Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning
Hanyang Zhao
Haoxian Chen
Ji Zhang
D. Yao
Wenpin Tang
106
1
0
03 Feb 2025
Visual Generation Without Guidance
Huayu Chen
Kai Jiang
Kaiwen Zheng
Jianfei Chen
Hang Su
Jun Zhu
146
2
0
28 Jan 2025
Improving Video Generation with Human Feedback
Jie Liu
Gongye Liu
Jiajun Liang
Ziyang Yuan
Xiaokun Liu
...
Pengfei Wan
Di Zhang
Kun Gai
Yujiu Yang
Wanli Ouyang
VGen
EGVM
140
22
0
23 Jan 2025
DiffDoctor: Diagnosing Image Diffusion Models Before Treating
Yiyang Wang
Xi Chen
Xiaogang Xu
S. Ji
Yongxu Liu
Yujun Shen
Hengshuang Zhao
DiffM
103
0
0
21 Jan 2025
A General Framework for Inference-time Scaling and Steering of Diffusion Models
R. Singhal
Zachary Horvitz
Ryan Teehan
Mengye Ren
Zhou Yu
Kathleen McKeown
Rajesh Ranganath
DiffM
118
27
0
17 Jan 2025
Text-Diffusion Red-Teaming of Large Language Models: Unveiling Harmful Behaviors with Proximity Constraints
Jonathan Nöther
Adish Singla
Goran Radanović
AAML
129
0
0
14 Jan 2025
AdaDiff: Adaptive Step Selection for Fast Diffusion Models
Hui Zhang
Zuxuan Wu
Zhen Xing
Jie Shao
Yu-Gang Jiang
127
11
0
31 Dec 2024
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
Zhen Liu
Tim Z. Xiao
Weiyang Liu
Yoshua Bengio
Dinghuai Zhang
174
5
0
10 Dec 2024
DyMO: Training-Free Diffusion Model Alignment with Dynamic Multi-Objective Scheduling
Xin Xie
Dong Gong
137
1
0
01 Dec 2024
Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward
Zhiwei Jia
Yuesong Nan
Huixi Zhao
Gengdai Liu
EGVM
157
1
0
22 Nov 2024
Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation
Huy Le
Miroslav Gabriel
Tai Hoang
Gerhard Neumann
Ngo Anh Vien
146
1
0
22 Nov 2024
Training Free Guided Flow Matching with Optimal Control
Luran Wang
Chaoran Cheng
Yizhen Liao
Yanru Qu
Ge Liu
70
3
0
23 Oct 2024
Preference Optimization with Multi-Sample Comparisons
Chaoqi Wang
Zhuokai Zhao
Chen Zhu
Karthik Abinav Sankararaman
Michal Valko
...
Zhaorun Chen
Madian Khabsa
Yuxin Chen
Hao Ma
Sinong Wang
113
10
0
16 Oct 2024
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Xinchen Zhang
Ling Yang
Ge Li
Yaqi Cai
Jiake Xie
Yong Tang
Yujiu Yang
Mengdi Wang
Bin Cui
EGVM
CoGe
69
10
0
09 Oct 2024
Training-free Diffusion Model Alignment with Sampling Demons
Po-Hung Yeh
Kuang-Huei Lee
Jun-Cheng Chen
81
9
0
08 Oct 2024
HERO: Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning
Ayano Hiranaka
Shang-Fu Chen
Chieh-Hsin Lai
Dongjun Kim
Naoki Murata
Takashi Shibuya
Wei-Hsiang Liao
Shao-Hua Sun
Yuki Mitsufuji
74
2
0
07 Oct 2024
Text2Chart31: Instruction Tuning for Chart Generation with Automatic Feedback
Fatemeh Pesaran Zadeh
Juyeon Kim
Jin-Hwa Kim
Gunhee Kim
ALM
90
4
0
05 Oct 2024
Adding Conditional Control to Diffusion Models with Reinforcement Learning
Yulai Zhao
Masatoshi Uehara
Gabriele Scalia
Tommaso Biancalani
Sergey Levine
Ehsan Hajiramezanali
Ehsan Hajiramezanali
AI4CE
105
6
0
17 Jun 2024
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
Hao Wen
Zehuan Huang
Yaohui Wang
Xinyuan Chen
Yu Qiao
149
9
0
05 Jun 2024
Amortizing intractable inference in diffusion models for vision, language, and control
S. Venkatraman
Moksh Jain
Luca Scimeca
Minsu Kim
Marcin Sendera
...
Alexandre Adam
Jarrid Rector-Brooks
Yoshua Bengio
Glen Berseth
Nikolay Malkin
137
31
0
31 May 2024
Curriculum Direct Preference Optimization for Diffusion and Consistency Models
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
N. Sebe
Mubarak Shah
EGVM
127
7
0
22 May 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
Vincent Fortuin
227
21
0
28 Feb 2024
Learning from Mistakes: Iterative Prompt Relabeling for Text-to-Image Diffusion Model Training
Xinyan Chen
Jiaxin Ge
Tianjun Zhang
Jiaming Liu
Shanghang Zhang
VLM
EGVM
100
0
0
23 Dec 2023
An Invitation to Deep Reinforcement Learning
Bernhard Jaeger
Andreas Geiger
OffRL
OOD
122
5
0
13 Dec 2023
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan.Z Sheng
Julian McAuley
Lina Yao
SyDa
149
13
0
28 Aug 2023
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
Jiazheng Xu
Xiao Liu
Yuchen Wu
Yuxuan Tong
Qinkai Li
Ming Ding
Jie Tang
Yuxiao Dong
125
376
0
12 Apr 2023
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Cheng Chi
Zhenjia Xu
S. Feng
Eric A. Cousineau
Yilun Du
Benjamin Burchfiel
Russ Tedrake
Shuran Song
347
1,189
0
07 Mar 2023
1
2
Next