Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.03206
Cited By
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"
50 / 814 papers shown
Title
DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging
Tianhui Song
Weixin Feng
Shuai Wang
X. Li
Tiezheng Ge
Bo Zheng
Limin Wang
MoMe
62
0
0
16 Apr 2025
ADT: Tuning Diffusion Models with Adversarial Supervision
Dazhong Shen
Guanglu Song
Yang Zhang
Bingqi Ma
Lujundong Li
D. Jiang
Zhuofan Zong
Y. Liu
DiffM
50
0
0
15 Apr 2025
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL
Junke Wang
Zhi Tian
Xinyu Wang
Xinyu Zhang
Weilin Huang
Zuxuan Wu
Yu Jiang
VGen
49
4
0
15 Apr 2025
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
67
0
0
15 Apr 2025
Efficient Generative Model Training via Embedded Representation Warmup
Deyuan Liu
Peng Sun
Xufeng Li
Tao Lin
33
0
0
14 Apr 2025
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Chunyang Zhang
Zhenhong Sun
Zhicheng Zhang
Junyan Wang
Yu Zhang
Dong Gong
H. Mo
Daoyi Dong
45
0
0
14 Apr 2025
H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
Yushu Wu
Yanyu Li
Ivan Skorokhodov
Anil Kag
Willi Menapace
Sharath Girish
Aliaksandr Siarohin
Yanzhi Wang
Sergey Tulyakov
DiffM
VGen
39
0
0
14 Apr 2025
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing
Taihang Hu
Linxuan Li
Kai Wang
Yaxing Wang
Jian Yang
Ming-Ming Cheng
DiffM
VGen
23
0
0
14 Apr 2025
EquiVDM: Equivariant Video Diffusion Models with Temporally Consistent Noise
Chao Liu
Arash Vahdat
DiffM
VGen
44
0
0
14 Apr 2025
REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers
Xingjian Leng
Jaskirat Singh
Yunzhong Hou
Zhenchang Xing
Saining Xie
Liang Zheng
39
0
0
14 Apr 2025
D
2
^2
2
iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia
Mengqi Huang
Nan Chen
Lei Zhang
Zhendong Mao
29
0
0
13 Apr 2025
BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning
Shengao Wang
Arjun Chandra
Aoming Liu
Venkatesh Saligrama
Boqing Gong
MLLM
VLM
47
0
0
13 Apr 2025
Head-Aware KV Cache Compression for Efficient Visual Autoregressive Modeling
Ziran Qin
Youru Lv
Mingbao Lin
Zeren Zhang
Danping Zou
Weiyao Lin
VLM
43
0
0
12 Apr 2025
UniFlowRestore: A General Video Restoration Framework via Flow Matching and Prompt Guidance
Shri Kiran Srinivasan
Yu Zhang
Chen Wu
Dianjie Lu
Dianjie Lu
Guijuan Zhan
Yang Weng
Zhuoran Zheng
DiffM
VGen
28
0
0
12 Apr 2025
Flux Already Knows -- Activating Subject-Driven Image Generation without Training
Hao Kang
Stathi Fotiadis
Liming Jiang
Qing Yan
Yumin Jia
Zichuan Liu
Min Jin Chong
Xin Lu
40
0
0
12 Apr 2025
MixDiT: Accelerating Image Diffusion Transformer Inference with Mixed-Precision MX Quantization
Daeun Kim
Jinwoo Hwang
Changhun Oh
Jongse Park
MQ
37
0
0
11 Apr 2025
LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping
Pascal Chang
Sergio Sancho
Jingwei Tang
Markus Gross
Vinicius Azevedo
30
0
0
11 Apr 2025
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
Team Seawead
Ceyuan Yang
Zhijie Lin
Yang Zhao
Shanchuan Lin
...
Zuquan Song
Zhenheng Yang
Jiashi Feng
Jianchao Yang
Lu Jiang
DiffM
93
1
0
11 Apr 2025
DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows
Mashrur M. Morshed
Vishnu Boddeti
38
0
0
10 Apr 2025
PixelFlow: Pixel-Space Generative Models with Flow
Shoufa Chen
Chongjian Ge
Shilong Zhang
Peize Sun
Ping Luo
VLM
DRL
37
0
0
10 Apr 2025
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning
Zhong-Yu Li
Ruoyi Du
Juncheng Yan
Le Zhuo
Zhen Li
Peng Gao
Zhanyu Ma
Ming-Ming Cheng
VLM
72
2
0
10 Apr 2025
Latent Diffusion U-Net Representations Contain Positional Embeddings and Anomalies
Jonas Loos
Lorenz Linhardt
26
0
0
09 Apr 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
71
0
0
09 Apr 2025
SIGMAN:Scaling 3D Human Gaussian Generation with Millions of Assets
Yuhang Yang
Fengqi Liu
Yixing Lu
Qin Zhao
Pingyu Wu
...
Ran Yi
Yang Cao
Lizhuang Ma
Zheng-jun Zha
Junting Dong
3DGS
47
0
0
09 Apr 2025
EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation
Diljeet Jagpal
Xi Chen
Vinay P. Namboodiri
DiffM
VGen
51
0
0
09 Apr 2025
PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering
Y. Gao
Zihang Lin
Chuanbin Liu
Min Zhou
T. Ge
Bo Zheng
Hongtao Xie
DiffM
40
0
0
09 Apr 2025
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis
Tri Ton
Ji Woo Hong
Chang D. Yoo
VGen
24
0
0
08 Apr 2025
Transfer between Modalities with MetaQueries
Xichen Pan
Satya Narayan Shukla
Aashu Singh
Zhuokai Zhao
Shlok Kumar Mishra
...
Jiuhai Chen
Kunpeng Li
F. Xu
Ji Hou
Saining Xie
DiffM
49
6
0
08 Apr 2025
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
Jiazi Bu
Pengyang Ling
Yujie Zhou
Pan Zhang
Tong Wu
Xiaoyi Dong
Yuhang Zang
Y. Cao
Dahua Lin
Jiaqi Wang
21
0
0
08 Apr 2025
Flash Sculptor: Modular 3D Worlds from Objects
Yujia Hu
Songhua Liu
Xingyi Yang
Xinchao Wang
34
0
0
08 Apr 2025
CREA: A Collaborative Multi-Agent Framework for Creative Content Generation with Diffusion Models
Kavana Venkatesh
Connor Dunlop
Pinar Yanardag
DiffM
40
0
0
07 Apr 2025
Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision
Yuandong Pu
Le Zhuo
Kaiwen Zhu
Liangbin Xie
Wenlong Zhang
Xiangyu Chen
Peng Gao
Yu Qiao
Chao Dong
Yihao Liu
MLLM
66
1
0
07 Apr 2025
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Mengchao Wang
Qiang Wang
Fan Jiang
Yaqi Fan
Yunpeng Zhang
Yonggang Qi
Kun Zhao
Mu Xu
DiffM
VGen
33
0
0
07 Apr 2025
Gaussian Mixture Flow Matching Models
Hansheng Chen
Kai Zhang
Hao Tan
Zexiang Xu
Fujun Luan
Leonidas J. Guibas
Gordon Wetzstein
Sai Bi
DiffM
63
0
0
07 Apr 2025
Studying Image Diffusion Features for Zero-Shot Video Object Segmentation
Thanos Delatolas
Vicky S. Kalogeiton
Dim P. Papadopoulos
DiffM
VOS
48
1
0
07 Apr 2025
Disentangling Instruction Influence in Diffusion Transformers for Parallel Multi-Instruction-Guided Image Editing
Hui Liu
Bin Zou
Suiyun Zhang
Kecheng Chen
Rui Liu
Haoliang Li
DiffM
68
0
0
07 Apr 2025
UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding
Yang Jiao
Haibo Qiu
Zequn Jie
S. Chen
Jingjing Chen
Lin Ma
Yu Jiang
34
2
0
06 Apr 2025
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov
Di Chang
Minh Tran
Hongkun Gong
Ashutosh Chaubey
Mohammad Soleymani
DiffM
VGen
23
0
0
05 Apr 2025
SDEIT: Semantic-Driven Electrical Impedance Tomography
Dong Liu
Yuanchao Wu
Bowen Tong
Jiansong Deng
DiffM
30
0
0
05 Apr 2025
Detection Limits and Statistical Separability of Tree Ring Watermarks in Rectified Flow-based Text-to-Image Generation Models
Ved Umrajkar
Aakash Kumar Singh
23
0
0
04 Apr 2025
Conditioning Diffusions Using Malliavin Calculus
Jakiw Pidstrigach
Elizabeth Baker
Carles Domingo-Enrich
George Deligiannidis
Nikolas Nüsken
DiffM
35
0
0
04 Apr 2025
Generating ensembles of spatially-coherent in-situ forecasts using flow matching
David Landry
C. Monteleoni
A. Charantonis
68
0
0
04 Apr 2025
OmniTalker: Real-Time Text-Driven Talking Head Generation with In-Context Audio-Visual Style Replication
Zhongjian Wang
Peng Zhang
Jinwei Qi
Guangyuan Wang Sheng Xu
Bang Zhang
Liefeng Bo
DiffM
VGen
38
0
0
03 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
D. Li
Di Qiu
J. Wang
Yikun Dou
...
J. Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
71
2
0
03 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Li Yuan
MLLM
EGVM
88
8
0
03 Apr 2025
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
Shaojin Wu
Mengqi Huang
Wenxu Wu
Yufeng Cheng
Fei Ding
Qian He
DiffM
55
4
0
02 Apr 2025
FlowR: Flowing from Sparse to Dense 3D Reconstructions
Tobias Fischer
Samuel Rota Buló
Yung-Hsu Yang
Nikhil Varma Keetha
Lorenzo Porzi
Norman Muller
Katja Schwarz
Jonathon Luiten
Marc Pollefeys
Peter Kontschieder
3DGS
48
0
0
02 Apr 2025
Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model
Jincheng Zhong
Xiangcheng Zhang
J. Z. Wang
Mingsheng Long
38
1
0
02 Apr 2025
WorldPrompter: Traversable Text-to-Scene Generation
Zhaoyang Zhang
Yannick Hold-Geoffroy
Miloš Hašan
Chen Ziwen
Fujun Luan
Julie Dorsey
Yiwei Hu
VGen
50
0
0
02 Apr 2025
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Huayang Huang
Xiangye Jin
Jiaxu Miao
Yu Wu
31
0
0
02 Apr 2025
Previous
1
2
3
4
5
6
...
15
16
17
Next