ResearchTrend.AI
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis


5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
    DiffM

Papers citing "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"

50 / 814 papers shown
PSDiffusion: Harmonized Multi-Layer Image Generation via Layout and Appearance Alignment
Dingbang Huang
Wenbo Li
Yifei Zhao
Xinyu Pan
Yanhong Zeng
Bo Dai
DiffM
7
0
0
16 May 2025
Towards Self-Improvement of Diffusion Models via Group Preference Optimization
Renjie Chen
Wenfeng Lin
Yichen Zhang
Jiangchuan Wei
Boyuan Liu
Chao Feng
Jiao Ran
Mingyu Guo
12
0
0
16 May 2025
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling
Yuang Ai
Qihang Fan
Xuefeng Hu
Zhenheng Yang
Ran He
Huaibo Huang
DiffM
17
0
0
16 May 2025
DDAE++: Enhancing Diffusion Models Towards Unified Generative and Discriminative Learning
Weilai Xiang
Hongyu Yang
Di Huang
Yunhong Wang
14
0
0
16 May 2025
CROC: Evaluating and Training T2I Metrics with Pseudo- and Human-Labeled Contrastive Robustness Checks
Christoph Leiter
Yuki M. Asano
M. Keuper
Steffen Eger
17
0
0
16 May 2025
CompAlign: Improving Compositional Text-to-Image Generation with a Complex Benchmark and Fine-Grained Feedback
Yixin Wan
Kai-Wei Chang
EGVM
CoGe
25
0
0
16 May 2025
DRAGON: A Large-Scale Dataset of Realistic Images Generated by Diffusion Models
Giulia Bertazzini
Daniele Baracchi
Dasara Shullani
Isao Echizen
Alessandro Piva
24
0
0
16 May 2025
MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation
Yanbo Ding
Xirui Hu
Zhizhi Guo
Yixuan Wang
DiffM
VGen
33
0
0
15 May 2025
FlowDreamer: A RGB-D World Model with Flow-based Motion Representations for Robot Manipulation
Jun Guo
Xiaojian Ma
Yikai Wang
Min Yang
Huaping Liu
Qing Li
VGen
32
0
0
15 May 2025
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis
Bingda Tang
Boyang Zheng
Xichen Pan
Sayak Paul
Saining Xie
29
0
0
15 May 2025
Path Gradients after Flow Matching
Lorenz Vaitl
Leon Klein
14
0
0
15 May 2025
Aquarius: A Family of Industry-Level Video Generation Models for Marketing Scenarios
Huafeng Shi
Jianzhong Liang
Rongchang Xie
Xian Wu
Cheng Chen
Chang Liu
VGen
17
0
0
14 May 2025
Fast Text-to-Audio Generation with Adversarial Post-Training
Zachary Novack
Zach Evans
Zack Zukowski
Josiah Taylor
CJ Carr
...
Adnan Al-Sinan
Gian Marco Iodice
Julian McAuley
Taylor Berg-Kirkpatrick
Jordi Pons
30
0
0
13 May 2025
H$^{\mathbf{3}}$DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning
Yiyang Lu
Yufeng Tian
Zhecheng Yuan
Xinyu Wang
Pu Hua
Zhengrong Xue
Huazhe Xu
31
0
0
12 May 2025
FLUXSynID: A Framework for Identity-Controlled Synthetic Face Generation with Document and Live Images
Raul Ismayilov
Dzemila Sero
Luuk Spreeuwers
29
0
0
12 May 2025
DanceGRPO: Unleashing GRPO on Visual Generation
Zeyue Xue
Jie Wu
Yu Gao
Fangyuan Kong
Lingting Zhu
...
Zhiheng Liu
Wei Liu
Qiushan Guo
Weilin Huang
Ping Luo
EGVM
VGen
52
0
0
12 May 2025
Addressing degeneracies in latent interpolation for diffusion models
Erik Landolsi
Fredrik Kahl
DiffM
45
0
0
12 May 2025
Improving Trajectory Stitching with Flow Models
Reece O'Mahoney
Wanming Yu
Ioannis Havoutis
33
0
0
12 May 2025
ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models
Ozgur Kara
Krishna Kumar Singh
Feng Liu
Duygu Ceylan
James M. Rehg
Tobias Hinz
DiffM
VGen
41
0
0
12 May 2025
You Only Look One Step: Accelerating Backpropagation in Diffusion Sampling with Gradient Shortcuts
Hongkun Dou
Zeyu Li
Xingyu Jiang
Hao Li
Lijun Yang
Wen Yao
Yue Deng
DiffM
38
0
0
12 May 2025
Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition
Zhiyuan Chen
Keyi Li
Yifan Jia
Le Ye
Yufei Ma
DiffM
35
0
0
09 May 2025
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Jae-Won Chung
Jiachen Liu
Jeff J. Ma
Ruofan Wu
Oh Jun Kweon
Yuxuan Xia
Zhiyu Wu
Mosharaf Chowdhury
28
0
0
09 May 2025
From Pixels to Perception: Interpretable Predictions via Instance-wise Grouped Feature Selection
Moritz Vandenhirtz
Julia E. Vogt
38
0
0
09 May 2025
InstanceGen: Image Generation with Instance-level Instructions
Etai Sella
Yanir Kleiman
Hadar Averbuch-Elor
33
0
0
08 May 2025
Flow-GRPO: Training Flow Matching Models via Online RL
Jie Liu
Gongye Liu
Jiajun Liang
Yongqian Li
Jiaheng Liu
Xinyu Wang
Pengfei Wan
Di Zhang
Wanli Ouyang
AI4CE
68
0
0
08 May 2025
Does CLIP perceive art the same way we do?
Andrea Asperti
Leonardo Dessì
Maria Chiara Tonetti
Nico Wu
48
0
0
08 May 2025
MeshGen: Generating PBR Textured Mesh with Render-Enhanced Auto-Encoder and Generative Data Augmentation
Zilong Chen
Yikai Wang
Wenqiang Sun
Feng Wang
Yiwen Chen
Huaping Liu
34
0
0
07 May 2025
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Divyansh Srivastava
Xiang Zhang
He Wen
Chenru Wen
Zhuowen Tu
DiffM
34
0
0
07 May 2025
Defining and Quantifying Creative Behavior in Popular Image Generators
Aditi Ramaswamy
Hana Chockler
Melane Navaratnarajah
31
0
0
07 May 2025
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Teng Hu
Zhentao Yu
Zhengguang Zhou
Sen Liang
Yuan Zhou
Qin Lin
Qinglin Lu
DiffM
VGen
57
0
0
07 May 2025
FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios
Shiyi Zhang
Junhao Zhuang
Zhaoyang Zhang
Ying Shan
Yansong Tang
VGen
107
0
0
06 May 2025
Distribution-Conditional Generation: From Class Distribution to Creative Generation
Fu Feng
Yucheng Xie
Xu Yang
Jing Wang
Xin Geng
DiffM
31
0
0
06 May 2025
FLUX-Text: A Simple and Advanced Diffusion Transformer Baseline for Scene Text Editing
Rui Lan
Y. Bai
Xu Duan
M. Li
Lei Sun
X. Chu
DiffM
140
0
0
06 May 2025
Multimodal Benchmarking and Recommendation of Text-to-Image Generation Models
Kapil Wanaskar
Gaytri Jena
Magdalini Eirinaki
EGVM
33
0
0
06 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Xuzhi Zhang
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
74
0
0
05 May 2025
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction
Biao Gong
Cheng Zou
Dandan Zheng
Hu Yu
Jingdong Chen
...
Qingpei Guo
Rui Liu
Weilong Chai
Xinyu Xiao
Ziyuan Huang
MLLM
79
1
0
05 May 2025
MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing
Zinan Guo
Pengze Zhang
Yanze Wu
Chong Mou
Mingcong Liu
Qian He
33
0
0
05 May 2025
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
D. Jiang
Mengmeng Wang
Liuzhuozheng Li
Lei Zhang
Haoyu Wang
Wei Wei
Guang Dai
Yanning Zhang
Jingdong Wang
DiffM
51
0
0
05 May 2025
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing
Ming Li
Xin Gu
Fan Chen
X. Xing
Longyin Wen
Cheng Chen
Sijie Zhu
DiffM
81
1
0
05 May 2025
T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models
Yunfeng Ge
Jiawei Li
Yiji Zhao
Haomin Wen
Zhao Li
M. Qiu
Hao Li
Ming Jin
Shirui Pan
DiffM
143
0
0
05 May 2025
VSC: Visual Search Compositional Text-to-Image Diffusion Model
Do Huu Dat
Nam Hyeonu
Po Yuan Mao
Tae-Hyun Oh
DiffM
CoGe
64
0
0
02 May 2025
Improving Editability in Image Generation with Layer-wise Memory
Daneul Kim
Jaeah Lee
Jaesik Park
DiffM
KELM
60
0
0
02 May 2025
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
60
0
0
01 May 2025
JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers
Kwon Byung-Ki
Qi Dai
Lee Hyoseok
Chong Luo
Tae-Hyun Oh
71
0
0
01 May 2025
Why Compress What You Can Generate? When GPT-4o Generation Ushers in Image Compression Fields
Yixin Gao
Xiaohan Pan
X. Li
Zhibo Chen
51
0
0
30 Apr 2025
AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
Yunhao Li
Sijing Wu
Wei Sun
Zhichao Zhang
Yucheng Zhu
Zicheng Zhang
Huiyu Duan
Xiongkuo Min
Guangtao Zhai
EGVM
90
0
0
30 Apr 2025
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
Qihao Liu
Ju He
Qihang Yu
Liang-Chieh Chen
Alan Yuille
DiffM
VGen
78
0
0
30 Apr 2025
Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing
Hong Zhang
Zhongjie Duan
Xingjun Wang
Yuze Zhao
Weiyi Lu
Zhipeng Di
Yongjun Xu
Yingda Chen
Yu Zhang
MLLM
94
1
0
30 Apr 2025
Antidote: A Unified Framework for Mitigating LVLM Hallucinations in Counterfactual Presupposition and Object Perception
Yuanchen Wu
Lu Zhang
Hang Yao
Junlong Du
Ke Yan
Shouhong Ding
Yunsheng Wu
Xuzhao Li
MLLM
71
0
0
29 Apr 2025
ADiff4TPP: Asynchronous Diffusion Models for Temporal Point Processes
Amartya Mukherjee
Ruizhi Deng
He Zhao
Yuzhen Mao
Leonid Sigal
Frederick Tung
DiffM
AI4TS
53
0
0
29 Apr 2025