Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.03206
Cited By
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"
50 / 826 papers shown
Title
DreamOmni: Unified Image Generation and Editing
Bin Xia
Yuechen Zhang
Jingyao Li
Chengyao Wang
Yitong Wang
Xinglong Wu
Bei Yu
Jiaya Jia
SyDa
MLLM
94
3
0
22 Dec 2024
Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation
Quan Dao
Hao Phung
T. Dao
Dimitris Metaxas
Anh Tran
98
1
0
22 Dec 2024
Two-in-One: Unified Multi-Person Interactive Motion Generation by Latent Diffusion Transformer
Yangqiu Song
Xihua Wang
Ruihua Song
Wenbing Huang
DiffM
VGen
80
1
0
21 Dec 2024
When Worse is Better: Navigating the compression-generation tradeoff in visual tokenization
Vivek Ramanujan
Kushal Tirumala
Armen Aghajanyan
Luke Zettlemoyer
Ali Farhadi
DiffM
76
2
0
20 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
129
9
0
19 Dec 2024
MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Ho Kei Cheng
Masato Ishii
Akio Hayakawa
Takashi Shibuya
A. Schwing
Yuki Mitsufuji
VGen
126
12
0
19 Dec 2024
E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling
Zhihang Yuan
Yuzhang Shang
Hao Zhang
Tongcheng Fang
Rui Xie
Bingxin Xu
Yan Yan
Shengen Yan
Guohao Dai
Yu Wang
DiffM
108
1
0
18 Dec 2024
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models
Xinghang Li
Peiyan Li
Minghuan Liu
Dong Wang
Jirong Liu
Bingyi Kang
Xiao Ma
Tao Kong
Hanbo Zhang
Huaping Liu
LM&Ro
99
18
0
18 Dec 2024
Real-time One-Step Diffusion-based Expressive Portrait Videos Generation
Hanzhong Guo
Hongwei Yi
Daquan Zhou
Alexander William Bergman
Michael Lingelbach
Yizhou Yu
DiffM
85
1
0
18 Dec 2024
Efficient Scaling of Diffusion Transformers for Text-to-Image Generation
Hao Li
Shamit Lal
Zhiheng Li
Yusheng Xie
Ying Wang
...
R. Manmatha
Zhuowen Tu
Stefano Ermon
Stefano Soatto
A. Swaminathan
86
0
0
16 Dec 2024
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models
Felix Taubner
Ruihang Zhang
Mathieu Tuli
David B. Lindell
80
2
0
16 Dec 2024
IDEA-Bench: How Far are Generative Models from Professional Designing?
C. Liang
Lianghua Huang
Jingwu Fang
Huanzhang Dou
Wei Wang
Zhi-Fan Wu
Yupeng Shi
Junge Zhang
Xin Zhao
Yu Liu
3DV
77
1
0
16 Dec 2024
IGR: Improving Diffusion Model for Garment Restoration from Person Image
Le Shen
Rong Huang
Zhijie Wang
DiffM
107
2
0
16 Dec 2024
InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
Rick Akkerman
Haiwen Feng
M. Black
Dimitrios Tzionas
Victoria Fernandez-Abrevaya
VGen
AI4CE
105
3
0
16 Dec 2024
UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models
Yuning Han
Bingyin Zhao
Rui Chu
Feng Luo
Biplab Sikdar
Yingjie Lao
DiffM
AAML
86
1
0
16 Dec 2024
ColorFlow: Retrieval-Augmented Image Sequence Colorization
Junhao Zhuang
Xuan Ju
Zhe Zhang
Yong-Jin Liu
Shiyi Zhang
Chun Yuan
Ying Shan
DiffM
110
1
0
16 Dec 2024
AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration
Wenhao Sun
Rong-Cheng Tu
Jingyi Liao
Zhao Jin
Dacheng Tao
VGen
111
1
0
16 Dec 2024
FlowDock: Geometric Flow Matching for Generative Protein-Ligand Docking and Affinity Prediction
Alex Morehead
Jianlin Cheng
OOD
121
2
0
14 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
211
2
0
14 Dec 2024
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
Yushu Wu
Zhixing Zhang
Yanyu Li
Yanwu Xu
Anil Kag
...
Ju Hu
Dimitris N. Metaxas
Yanzhi Wang
Sergey Tulyakov
Jian Ren
DiffM
VGen
102
4
0
13 Dec 2024
OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs
Yuanzhi Zhu
R. Wang
Shilin Lu
Junnan Li
Hanshu Yan
Peng Sun
SupR
89
3
0
12 Dec 2024
Learned Compression for Compressed Learning
Dan G. Jacobellis
N. Yadwadkar
84
0
0
12 Dec 2024
UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer
Delong Liu
Zhaohui Hou
Mingjie Zhan
Shihao Han
Zhaohui Hou
Zhicheng Zhao
VGen
93
0
0
12 Dec 2024
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing
Gaoxiang Cong
Jiadong Pan
Liang-Sheng Li
Yuankai Qi
Yuxin Peng
Anton Van Den Hengel
Jian Yang
Qingming Huang
92
6
0
12 Dec 2024
Olympus: A Universal Task Router for Computer Vision Tasks
Yuanze Lin
Yunsheng Li
Dongdong Chen
Weijian Xu
Ronald Clark
Philip Torr
VLM
ObjD
239
0
0
12 Dec 2024
From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Tianwei Yin
Qiang Zhang
Richard Zhang
William T. Freeman
F. Durand
Eli Shechtman
Xun Huang
VGen
DiffM
81
5
0
10 Dec 2024
LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors
Yusuf Dalva
Yuan Li
Qing Liu
Nanxuan Zhao
Jianming Zhang
Zhe Lin
Pinar Yanardag
AI4CE
64
1
0
05 Dec 2024
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts
Ziwei Huang
Wanggui He
Quanyu Long
Yandi Wang
Haoyuan Li
...
Fangxun Shu
Long Chen
Hao Jiang
Leilei Gan
Fei Wu
EGVM
241
3
0
05 Dec 2024
Safeguarding Text-to-Image Generation via Inference-Time Prompt-Noise Optimization
Jiangweizhi Peng
Zhiwei Tang
Gaowen Liu
Charles Fleming
Mingyi Hong
81
2
0
05 Dec 2024
Pinco: Position-induced Consistent Adapter for Diffusion Transformer in Foreground-conditioned Inpainting
Guangben Lu
Yuzhen Du
Zhimin Sun
Ran Yi
Yifan Qi
Yizhe Tang
Tianyi Wang
Lizhuang Ma
Fangyuan Zou
DiffM
80
1
0
05 Dec 2024
Coordinate In and Value Out: Training Flow Transformers in Ambient Space
Yuyang Wang
Anurag Ranjan
J. Susskind
Miguel Angel Bautista
3DPC
81
0
0
05 Dec 2024
TASR: Timestep-Aware Diffusion Model for Image Super-Resolution
Qinwei Lin
Xiaopeng Sun
Yu Gao
Yujie Zhong
Dengjie Li
Zheng Zhao
Haoqian Wang
76
0
0
04 Dec 2024
Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation
Gianni Franchi
Dat Nguyen Trong
Nacim Belkhir
Guoxuan Xia
Andrea Pilzer
UQLM
78
0
0
04 Dec 2024
UTSD: Unified Time Series Diffusion Model
Xiangkai Ma
Xiaobin Hong
Wenzhong Li
Sanglu Lu
82
0
0
04 Dec 2024
SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text
Haohe Liu
Gaël Le Lan
Xinhao Mei
Zhaoheng Ni
Anurag Kumar
Varun K. Nagaraja
Wenwu Wang
Mark D. Plumbley
Yangyang Shi
Vikas Chandra
VGen
64
1
0
03 Dec 2024
SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance
Viet-Anh Nguyen
A. Nguyen
T. Dao
K. Nguyen
Cuong Pham
Toan M. Tran
Anh Tran
DiffM
79
1
0
03 Dec 2024
ScImage: How Good Are Multimodal Large Language Models at Scientific Text-to-Image Generation?
Leixin Zhang
Steffen Eger
Yinjie Cheng
Weihe Zhai
Jonas Belouadi
Christoph Leiter
Simone Paolo Ponzetto
Fahimeh Moafian
Zhixue Zhao
MLLM
96
1
0
03 Dec 2024
World-consistent Video Diffusion with Explicit 3D Modeling
Qihang Zhang
Shuangfei Zhai
Miguel Angel Bautista
Kevin Miao
Alexander Toshev
J. Susskind
Jiatao Gu
VGen
83
8
0
02 Dec 2024
MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost
Sen Xing
Muyan Zhong
Zeqiang Lai
Liangchen Li
Jun Liu
Yaohui Wang
Jifeng Dai
Wenhai Wang
83
1
0
02 Dec 2024
IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models
Khaled Abud
Sergey Lavrushkin
Alexey Kirillov
D. Vatolin
94
0
0
02 Dec 2024
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Zichun Liao
Yusuke Kato
Kazuki Kozuka
Aditya Grover
VGen
102
5
0
02 Dec 2024
Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis
Anton Voronov
Denis Kuznedelev
Mikhail Khoroshikh
Valentin Khrulkov
Dmitry Baranchuk
111
2
0
02 Dec 2024
DyMO: Training-Free Diffusion Model Alignment with Dynamic Multi-Objective Scheduling
Xin Xie
Dong Gong
82
1
0
01 Dec 2024
The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning
Ruben Ohana
Michael McCabe
Lucas Meyer
Rudy Morel
Fruzsina J. Agocs
...
François Rozet
Liam Parker
M. Cranmer
S. Ho
Shirley Ho
PINN
AI4CE
74
8
1
30 Nov 2024
Diffusion Model Guided Sampling with Pixel-Wise Aleatoric Uncertainty Estimation
Michele De Vita
Vasileios Belagiannis
DiffM
93
1
0
29 Nov 2024
Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing
Wenyi Mo
Tianyu Zhang
Yalong Bai
Bing-Huang Su
Ji-Rong Wen
DiffM
76
0
0
29 Nov 2024
Open-Sora Plan: Open-Source Large Video Generation Model
Bin Lin
Yunyang Ge
Xinhua Cheng
Zongjian Li
Bin Zhu
...
Zhang Pan
Xing Zhou
Shaoling Dong
Yonghong Tian
Li-xin Yuan
VLM
VGen
118
60
0
28 Nov 2024
SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing
Rong-Cheng Tu
Wenhao Sun
Zhao Jin
Jingyi Liao
Jiaxing Huang
Dacheng Tao
VGen
DiffM
112
3
0
28 Nov 2024
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
82
0
0
28 Nov 2024
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads
Siqi Kou
Jiachun Jin
Chang Liu
Ye Ma
Jian Jia
Quan Chen
Peng Jiang
Zhijie Deng
Zhijie Deng
DiffM
VGen
VLM
135
6
0
28 Nov 2024
Previous
1
2
3
...
9
10
11
...
15
16
17
Next