Scaling Rectified Flow Transformers for High-Resolution Image Synthesis


5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
    DiffM

Papers citing "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"

50 / 814 papers shown
Title
Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization
Kyle Sargent
Kyle Hsu
Justin Johnson
L. Fei-Fei
Jiajun Wu
DiffM
MU
58
3
0
14 Mar 2025
Pathology Image Compression with Pre-trained Autoencoders
Srikar Yellapragada
Alexandros Graikos
Kostas Triaridis
Zilinghan Li
Tarak Nandi
Ravi K. Madduri
Prateek Prasanna
Joel H. Saltz
Dimitris Samaras
MedIm
36
0
0
14 Mar 2025
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
57
0
0
14 Mar 2025
Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models
Hao-Ran Cheng
Erjia Xiao
Yichi Wang
Kaidi Xu
Mengshu Sun
Jindong Gu
Renjing Xu
41
0
0
14 Mar 2025
Spatio-temporal Fourier Transformer (StFT) for Long-term Dynamics Prediction
Da Long
Shandian Zhe
Samuel Williams
L. Oliker
Zhe Bai
AI4TS
AI4CE
44
0
0
14 Mar 2025
Upcycling Text-to-Image Diffusion Models for Multi-Task Capabilities
Ruchika Chavhan
Abhinav Mehrotra
Malcolm Chadwick
Alberto Gil C. P. Ramos
Luca Morreale
Mehdi Noroozi
Sourav Bhattacharya
49
0
0
14 Mar 2025
MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation
Sungwoo Cho
J. Choi
Sungnyun Kim
Se-Young Yun
63
0
0
14 Mar 2025
Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation
Yi Wu
Lingting Zhu
Lei Liu
Wandi Qiao
Ziqiang Li
Lequan Yu
Bin Li
DiffM
52
0
0
13 Mar 2025
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
Hao He
Ceyuan Yang
Shanchuan Lin
Yinghao Xu
Meng Wei
Liangke Gui
Qi Zhao
Gordon Wetzstein
Lu Jiang
Hongsheng Li
DiffM
VGen
102
5
0
13 Mar 2025
ExtremeAIGC: Benchmarking LMM Vulnerability to AI-Generated Extremist Content
Bhavik Chandna
Mariam Aboujenane
Usman Naseem
60
0
0
13 Mar 2025
The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation
Ho Kei Cheng
Alexander Schwing
OT
74
0
0
13 Mar 2025
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
Ziyan Guo
Zeyu Hu
Na Zhao
De Wen Soh
VGen
94
2
0
13 Mar 2025
VideoMerge: Towards Training-free Long Video Generation
Siyang Zhang
Harry Yang
Ser-Nam Lim
DiffM
VGen
50
0
0
13 Mar 2025
Do I look like a `cat.n.01` to you? A Taxonomy Image Generation Benchmark
Viktor Moskvoretskii
Alina Lobanova
Ekaterina Neminova
Chris Biemann
Alexander Panchenko
Irina Nikishina
47
0
0
13 Mar 2025
Long Context Tuning for Video Generation
Yuwei Guo
Ceyuan Yang
Ziyan Yang
Zhibei Ma
Zhijie Lin
Zhenheng Yang
Dahua Lin
Lu Jiang
DiffM
VGen
76
2
0
13 Mar 2025
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
Yufan Deng
Xun Guo
Yunhong Wang
Jacob Zhiyuan Fang
Angtian Wang
Shenghai Yuan
Yiding Yang
Bo Liu
Haibin Huang
Chongyang Ma
DiffM
VGen
72
0
0
13 Mar 2025
RFLAV: Rolling Flow matching for infinite Audio Video generation
Alex Ergasti
Giuseppe Tarollo
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
VGen
45
0
0
13 Mar 2025
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Rongyao Fang
Chengqi Duan
Kun Wang
Linjiang Huang
Hao Li
...
Xingyu Zeng
R. Zhao
Jifeng Dai
Xihui Liu
Hongsheng Li
MLLM
ReLM
LRM
112
5
0
13 Mar 2025
OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting
Yongsheng Yu
Ziyun Zeng
Haitian Zheng
Jiebo Luo
DiffM
59
0
0
13 Mar 2025
Investigating and Improving Counter-Stereotypical Action Relation in Text-to-Image Diffusion Models
Sina Malakouti
Adriana Kovashka
EGVM
69
0
0
13 Mar 2025
Probabilistic Forecasting via Autoregressive Flow Matching
Ahmed El-Gazzar
Marcel van Gerven
AI4TS
57
0
0
13 Mar 2025
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models
Runze He
Bo Cheng
Yuhang Ma
Qingxiang Jia
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Liebucha Wu
Dawei Leng
Yuhui Yin
DiffM
VLM
54
0
0
13 Mar 2025
Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers
Yasheng Sun
Zhiliang Xu
Hang Zhou
Jiazhi Guan
Quanwei Yang
...
Yingying Li
Haocheng Feng
J. Wang
Ziwei Liu
Koike Hideki
VGen
61
0
0
13 Mar 2025
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation
Chen Chen
Rui Qian
Wenze Hu
Tsu-jui Fu
Jialing Tong
...
Lezhi Li
Bowen Zhang
A. Schwing
Wei Liu
Yuqing Yang
64
0
0
13 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Yu Qiao
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
84
8
0
13 Mar 2025
DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image
Qi Zhao
Zhan Ma
Pan Zhou
VGen
75
0
0
13 Mar 2025
MoFlow: One-Step Flow Matching for Human Trajectory Forecasting via Implicit Maximum Likelihood Estimation based Distillation
Yuxiang Fu
Qi Yan
Lele Wang
Ke Li
Renjie Liao
AI4TS
44
1
0
13 Mar 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Ju He
Qihang Yu
Qihao Liu
Liang-Chieh Chen
71
0
0
13 Mar 2025
Fine-Tuning Diffusion Generative Models via Rich Preference Optimization
Hanyang Zhao
Haoxian Chen
Yucheng Guo
Genta Indra Winata
Tingting Ou
Ziyu Huang
D. Yao
Wenpin Tang
59
0
0
13 Mar 2025
AdvPaint: Protecting Images from Inpainting Manipulation via Adversarial Attention Disruption
Joonsung Jeon
Woo Jae Kim
Suhyeon Ha
Sooel Son
Sung-eui Yoon
DiffM
AAML
54
0
0
13 Mar 2025
NAMI: Efficient Image Generation via Progressive Rectified Flow Transformers
Yuhang Ma
Bo Cheng
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Liebucha Wu
Dawei Leng
Yuhui Yin
60
0
0
12 Mar 2025
Leveraging Semantic Attribute Binding for Free-Lunch Color Control in Diffusion Models
Héctor Laria
Alexandra Gomez-Villa
Jiang Qin
Muhammad Atif Butt
Bogdan Raducanu
Javier Vázquez-Corral
J. Weijer
Kai Wang
DiffM
65
0
0
12 Mar 2025
Unified Dense Prediction of Video Diffusion
Lehan Yang
Lu Qi
Xianrui Li
Sheng Li
Varun Jampani
Ming Yang
MDE
VOS
VGen
63
0
0
12 Mar 2025
Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space
Yifan Zhou
Zeqi Xiao
Shuai Yang
Xingang Pan
69
2
0
12 Mar 2025
I2V3D: Controllable image-to-video generation with 3D guidance
Zhiyuan Zhang
Dongdong Chen
J. Liao
VGen
55
0
0
12 Mar 2025
Zero-Shot Subject-Centric Generation for Creative Application Using Entropy Fusion
Kaifeng Zou
Xiaoyi Feng
Peng Wang
Tao Huang
Zizhou Huang
Zhang Haihang
Yuntao Zou
Dagang Li
DiffM
51
0
0
12 Mar 2025
PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling
Nikolai Korber
Eduard Kromer
Andreas Siebert
S. Hauke
Daniel Mueller-Gritschneder
Björn Schuller
56
0
0
12 Mar 2025
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
Junsong Chen
Shuchen Xue
Yuyang Zhao
Jincheng Yu
Sayak Paul
Junyu Chen
Han Cai
E. Xie
Enze Xie
VLM
66
2
0
12 Mar 2025
UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer
Haoxuan Wang
Jinlong Peng
Q. He
Hao Yang
Ying Jin
...
Yanjie Pan
Zhenye Gan
M. Chi
Bo Peng
Yishuo Wang
DiffM
60
0
0
12 Mar 2025
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation
Hyeonho Jeong
Suhyeon Lee
Jong Chul Ye
VGen
163
0
0
12 Mar 2025
Exploring Bias in over 100 Text-to-Image Generative Models
J. Vice
Naveed Akhtar
Richard I. Hartley
Ajmal Saeed Mian
EGVM
67
3
0
11 Mar 2025
SARA: Structural and Adversarial Representation Alignment for Training-efficient Diffusion Models
Hesen Chen
Junyan Wang
Zhiyu Tan
Hao Li
58
0
0
11 Mar 2025
V2M4: 4D Mesh Animation Reconstruction from a Single Monocular Video
Jianqi Chen
Biao Zhang
Xiangjun Tang
Peter Wonka
VGen
57
0
0
11 Mar 2025
OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models
Jialv Zou
Bencheng Liao
Qian Zhang
Wenyu Liu
Xinggang Wang
Mamba
MLLM
82
1
0
11 Mar 2025
Aligning Text to Image in Diffusion Models is Easier Than You Think
J. Lee
Byunghee Cha
Jeongsol Kim
Jong Chul Ye
52
0
0
11 Mar 2025
Controlling Latent Diffusion Using Latent CLIP
Jason Becker
Chris Wendler
Peter Baylies
Robert West
Christian Wressnegger
DiffM
VLM
65
0
0
11 Mar 2025
OminiControl2: Efficient Conditioning for Diffusion Transformers
Zhenxiong Tan
Qiaochu Xue
Xingyi Yang
Songhua Liu
Xinchao Wang
DiffM
50
0
0
11 Mar 2025
FP3: A 3D Foundation Policy for Robotic Manipulation
Rujia Yang
Geng Chen
Chuan Wen
Yang Gao
LM&Ro
78
1
0
11 Mar 2025
MegaSR: Mining Customized Semantics and Expressive Guidance for Image Super-Resolution
X. Li
Jianlong Wu
Xinchuan Huang
C. L. Philip Chen
Weili Guan
Xian-Sheng Hua
Liqiang Nie
DiffM
56
0
0
11 Mar 2025
HOFAR: High-Order Augmentation of Flow Autoregressive Transformers
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
Mingda Wan
75
1
0
11 Mar 2025