Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.03206
Cited By
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"
50 / 818 papers shown
Title
HOFAR: High-Order Augmentation of Flow Autoregressive Transformers
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao Song
Mingda Wan
75
1
0
11 Mar 2025
Controlling Latent Diffusion Using Latent CLIP
Jason Becker
Chris Wendler
Peter Baylies
Robert West
Christian Wressnegger
DiffM
VLM
68
0
0
11 Mar 2025
SARA: Structural and Adversarial Representation Alignment for Training-efficient Diffusion Models
Hesen Chen
Junyan Wang
Zhiyu Tan
Hao Li
58
0
0
11 Mar 2025
Aligning Text to Image in Diffusion Models is Easier Than You Think
J. Lee
Byunghee Cha
Jeongsol Kim
Jong Chul Ye
52
0
0
11 Mar 2025
MegaSR: Mining Customized Semantics and Expressive Guidance for Image Super-Resolution
X. Li
Jianlong Wu
Xinchuan Huang
C. L. Philip Chen
Weili Guan
Xian-Sheng Hua
Liqiang Nie
DiffM
56
0
0
11 Mar 2025
DiffEGG: Diffusion-Driven Edge Generation as a Pixel-Annotation-Free Alternative for Instance Annotation
Sanghyun Jo
Ziseok Lee
Wooyeol Lee
Kyungsu Kim
47
0
0
11 Mar 2025
NullFace: Training-Free Localized Face Anonymization
Han-Wei Kung
Tuomas Varanka
Terence Sim
N. Sebe
DiffM
PICV
68
0
0
11 Mar 2025
FP3: A 3D Foundation Policy for Robotic Manipulation
Rujia Yang
Geng Chen
Chuan Wen
Yang Gao
LM&Ro
81
1
0
11 Mar 2025
U-StyDiT: Ultra-high Quality Artistic Style Transfer Using Diffusion Transformers
Zhanjie Zhang
Ao Ma
Ke Cao
Jing Wang
Shanyuan Liu
Yuhang Ma
Bo Cheng
Dawei Leng
Yuhui Yin
67
0
0
11 Mar 2025
Rethinking Diffusion Model in High Dimension
Zhenxin Zheng
Zhenjie Zheng
DiffM
51
0
0
11 Mar 2025
DreamRelation: Relation-Centric Video Customization
Yujie Wei
Shiwei Zhang
Hangjie Yuan
Biao Gong
Longxiang Tang
...
Haonan Qiu
Hengjia Li
Shuai Tan
Yuyao Zhang
Hongming Shan
VGen
70
1
0
10 Mar 2025
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity
Kwanyoung Kim
Byeongsu Sim
DiffM
VLM
58
0
0
10 Mar 2025
TIDE : Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation
Victor Shea-Jay Huang
Le Zhuo
Yi Xin
Zhaokai Wang
Peng Gao
Hongsheng Li
DiffM
48
1
0
10 Mar 2025
V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation
Guiwei Zhang
Tianyu Zhang
Mohan Zhou
Yalong Bai
Biye Li
66
0
0
10 Mar 2025
TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision
Shaobin Zhuang
Yiwei Guo
Yanbo Ding
Kunchang Li
Xinyuan Chen
Yaohui Wang
Fangyikang Wang
Ying Zhang
Chen Li
Yijiao Wang
45
0
0
10 Mar 2025
FaceID-6M: A Large-Scale, Open-Source FaceID Customization Dataset
Shuhe Wang
Xiaoya Li
Jiwei Li
G. Wang
Xiaofei Sun
...
Han Qiu
Mo Yu
Shengjie Shen
Tianwei Zhang
Eduard H. Hovy
VLM
63
0
0
10 Mar 2025
LatexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending
Jian Jin
Zhenbo Yu
Yang Shen
Zhenyong Fu
Jian Yang
DiffM
63
0
0
10 Mar 2025
RayFlow: Instance-Aware Diffusion Acceleration via Adaptive Flow Trajectories
Huiyang Shao
Xin Xia
Yanting Yang
Yuxi Ren
Xing Wang
Xuefeng Xiao
56
1
0
10 Mar 2025
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
Yuwei Niu
Munan Ning
Mengren Zheng
Bin Lin
Peng Jin
Jiaqi Liao
Kunpeng Ning
Bin Zhu
Li Yuan
EGVM
64
13
0
10 Mar 2025
SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models
Ouxiang Li
Yuan Wang
Xinting Hu
Houcheng Jiang
Tao Liang
Y. Hao
Guojun Ma
Fuli Feng
DiffM
49
1
0
10 Mar 2025
Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model
Lixue Gong
Xiaoxia Hou
Fanshi Li
Liang Li
Xiaochen Lian
...
Qi Zhang
Yuwei Zhang
Shijia Zhao
Jianchao Yang
Weilin Huang
DiffM
VLM
63
6
0
10 Mar 2025
LBM: Latent Bridge Matching for Fast Image-to-Image Translation
Clement Chadebec
O. Tasar
Sanjeev Sreetharan
Benjamin Aubin
42
0
0
10 Mar 2025
VACE: All-in-One Video Creation and Editing
Zeyinzi Jiang
Zhen Han
Chaojie Mao
J. Zhang
Yulin Pan
Yu Liu
DiffM
VGen
56
5
0
10 Mar 2025
Post-Training Quantization for Diffusion Transformer via Hierarchical Timestep Grouping
Ning Ding
Jing Han
Yuchuan Tian
Chao Xu
Kai Han
Yehui Tang
MQ
44
0
0
10 Mar 2025
TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models
Ruidong Chen
Honglin Guo
Lanjun Wang
Chenyu Zhang
Weizhi Nie
An-an Liu
DiffM
66
1
0
10 Mar 2025
PixelPonder: Dynamic Patch Adaptation for Enhanced Multi-Conditional Text-to-Image Generation
Yanjie Pan
Q. He
Zhengkai Jiang
P. Xu
Chaoyi Wang
...
Yun Cao
Zhenye Gan
M. Chi
Bo Peng
Yishuo Wang
DiffM
66
1
0
09 Mar 2025
DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability
Xirui Hu
Jiahao Wang
Hao Chen
Weizhan Zhang
Benqi Wang
Yangfu Li
Haishun Nan
DiffM
67
0
0
09 Mar 2025
Conceptrol: Concept Control of Zero-shot Personalized Image Generation
Qiyuan He
Angela Yao
DiffM
41
0
0
09 Mar 2025
One-Step Diffusion Model for Image Motion-Deblurring
X. Liu
Yuquan Wang
Zhaoyu Chen
Jingyun Liang
He Zhang
Yuyao Zhang
Xiaokang Yang
DiffM
76
0
0
09 Mar 2025
USP: Unified Self-Supervised Pretraining for Image Generation and Understanding
Xiangxiang Chu
Renda Li
Yong Wang
65
0
0
08 Mar 2025
PTDiffusion: Free Lunch for Generating Optical Illusion Hidden Pictures with Phase-Transferred Diffusion Model
Xiang Gao
Shuai Yang
Jiaying Liu
DiffM
51
0
0
08 Mar 2025
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation
Runze Zhang
Guoguang Du
Xiaochuan Li
Qi Jia
Liang Jin
...
Zhenhua Guo
Yaqian Zhao
Xiaoli Gong
Rengang Li
Baoyu Fan
VGen
75
0
0
08 Mar 2025
Frequency Autoregressive Image Generation with Continuous Tokens
Hu Yu
Hao Luo
Hangjie Yuan
Yu Rong
Feng Zhao
VGen
44
3
0
07 Mar 2025
GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving
Zebin Xing
Xinsong Zhang
Yang Hu
Bo Jiang
Tong He
Qian Zhang
Xiaoxiao Long
Wei Yin
64
3
0
07 Mar 2025
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice
Hongwei Yi
Tian Ye
Shitong Shao
Xuancheng Yang
Jiantong Zhao
...
Zeke Xie
Lei Zhu
Wei Li
Michael Lingelbach
Daquan Zhou
VGen
52
1
0
07 Mar 2025
LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding
Shen Zhang
Yaning Tan
Siyuan Liang
Zhaowei Chen
Linze Li
...
Shuheng Li
Zhenyu Zhao
Caihua Chen
Jiajun Liang
Yao Tang
51
0
0
06 Mar 2025
Synthetic Data is an Elegant GIFT for Continual Vision-Language Models
Bin Wu
Wuxuan Shi
Jinqiao Wang
Mang Ye
CLL
VLM
56
0
0
06 Mar 2025
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
Rui Zhao
Weijia Mao
Mike Zheng Shou
66
0
0
05 Mar 2025
All-atom Diffusion Transformers: Unified generative modelling of molecules and materials
Chaitanya K. Joshi
Xiang Fu
Yi-Lun Liao
Vahe Gharakhanyan
Benjamin Kurt Miller
Anuroop Sriram
Zachary W. Ulissi
DiffM
55
4
0
05 Mar 2025
Generative Modeling of Microweather Wind Velocities for Urban Air Mobility
Tristan A. Shah
Michael C. Stanley
James E. Warner
50
1
0
04 Mar 2025
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification
Zhen Yang
Guibao Shen
Liang Hou
Mushui Liu
Luozhou Wang
Xin Tao
Pengfei Wan
Di Zhang
Ying-cong Chen
DiffM
79
0
0
04 Mar 2025
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data
Haoxin Li
Boyang Li
CoGe
73
0
0
03 Mar 2025
Vid2Avatar-Pro: Authentic Avatar from Videos in the Wild via Universal Prior
Chen Guo
Junxuan Li
Yash Kant
Yaser Sheikh
Shunsuke Saito
Chen Cao
40
1
0
03 Mar 2025
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator
Kaiwen Zheng
Yongxin Chen
Huayu Chen
Guande He
Xuan Li
Jun Zhu
Qinsheng Zhang
DiffM
49
0
0
03 Mar 2025
MINT: Multi-modal Chain of Thought in Unified Generative Models for Enhanced Image Generation
Yi Wang
Mushui Liu
Wanggui He
Longxiang Zhang
Z. Huang
...
Yiming Li
Weilong Dai
Mingli Song
Jie Song
Hao Jiang
MLLM
MoE
LRM
83
1
0
03 Mar 2025
How simple can you go? An off-the-shelf transformer approach to molecular dynamics
Max Eissler
Tim Korjakow
Stefan Ganscha
Oliver T. Unke
Klaus-Robert Müller
Stefan Gugler
63
1
0
03 Mar 2025
Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection
Boyong He
Yuxiang Ji
Qianwen Ye
Zhuoyue Tan
Liaoni Wu
DiffM
68
0
0
03 Mar 2025
Proteina: Scaling Flow-based Protein Structure Generative Models
Tomas Geffner
Kieran Didi
Zuobai Zhang
Danny Reidenbach
Zhonglin Cao
...
Mario Geiger
Christian Dallago
E. Küçükbenli
Arash Vahdat
Karsten Kreis
DiffM
AI4CE
49
4
0
02 Mar 2025
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
Alexander H. Liu
Sang-gil Lee
Chao-Han Huck Yang
Yuan Gong
Yu-Chun Wang
James Glass
Rafael Valle
Bryan Catanzaro
SSL
55
0
0
02 Mar 2025
Learning to Animate Images from A Few Videos to Portray Delicate Human Actions
Haoxin Li
Yingchen Yu
Qilong Wu
Hanwang Zhang
Boyang Li
Song Bai
3DH
VGen
174
0
0
01 Mar 2025
Previous
1
2
3
...
6
7
8
...
15
16
17
Next