ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.03206
  4. Cited By
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

5 March 2024
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
Harry Saini
Yam Levi
Dominik Lorenz
Axel Sauer
Frederic Boesel
Dustin Podell
Tim Dockhorn
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
    DiffM
ArXivPDFHTML

Papers citing "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis"

50 / 814 papers shown
Title
FlowR: Flowing from Sparse to Dense 3D Reconstructions
FlowR: Flowing from Sparse to Dense 3D Reconstructions
Tobias Fischer
Samuel Rota Buló
Yung-Hsu Yang
Nikhil Varma Keetha
Lorenzo Porzi
Norman Muller
Katja Schwarz
Jonathon Luiten
Marc Pollefeys
Peter Kontschieder
3DGS
48
0
0
02 Apr 2025
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
Shaojin Wu
Mengqi Huang
Wenxu Wu
Yufeng Cheng
Fei Ding
Qian He
DiffM
58
4
0
02 Apr 2025
Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis
Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis
Zixuan Wang
Duo Peng
Feng Chen
Yuqing Yang
Yinjie Lei
DiffM
79
0
0
02 Apr 2025
SPF-Portrait: Towards Pure Portrait Customization with Semantic Pollution-Free Fine-tuning
SPF-Portrait: Towards Pure Portrait Customization with Semantic Pollution-Free Fine-tuning
Xiaole Xian
Zhichao Liao
Qingyu Li
Wenyu Qin
Pengfei Wan
Weicheng Xie
Long Zeng
L. Shen
Pingfa Feng
DiffM
61
0
0
01 Apr 2025
Distilling Multi-view Diffusion Models into 3D Generators
Distilling Multi-view Diffusion Models into 3D Generators
Hao Qin
Luyuan Chen
Ming Kong
Mengxu Lu
Qiang Zhu
3DGS
64
0
0
01 Apr 2025
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning
Hang Guo
Yawei Li
Taolin Zhang
J. Wang
Tao Dai
Shu-Tao Xia
Luca Benini
72
2
0
30 Mar 2025
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
Nikai Du
Zhennan Chen
Z. Chen
Shan Gao
Xi Chen
Zhengkai Jiang
Jian Yang
Ying Tai
DiffM
43
0
0
30 Mar 2025
A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
Leander Girrbach
Stephan Alaniz
Genevieve Smith
Zeynep Akata
40
0
0
30 Mar 2025
DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution
DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution
Zheng-Peng Duan
Jiawei Zhang
Xin Jin
Zhe Zhang
Zheng Xiong
Dongqing Zou
Jimmy S. Ren
Chun-Le Guo
Chongyi Li
42
0
0
30 Mar 2025
On Geometrical Properties of Text Token Embeddings for Strong Semantic Binding in Text-to-Image Generation
On Geometrical Properties of Text Token Embeddings for Strong Semantic Binding in Text-to-Image Generation
H. Seo
Junseo Bang
Haechang Lee
Joohoon Lee
Byung Hyun Lee
Se Young Chun
46
0
0
29 Mar 2025
Synthetic Art Generation and DeepFake Detection A Study on Jamini Roy Inspired Dataset
Synthetic Art Generation and DeepFake Detection A Study on Jamini Roy Inspired Dataset
Kushal Agrawal
Romi Banerjee
43
0
0
29 Mar 2025
MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs
MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs
Xianglong He
Junyi Chen
Di Huang
Zexiang Liu
Xiaoshui Huang
Wanli Ouyang
C. Yuan
Yangguang Li
DiffM
57
0
0
29 Mar 2025
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation
Minho Park
S. Park
Jungsoo Lee
Hyojin Park
Kyuwoong Hwang
Fatih Porikli
Jaegul Choo
Sungha Choi
39
0
0
28 Mar 2025
EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation
EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation
Hadrien Reynaud
Alberto Gomez
Paul Leeson
Qingjie Meng
B. Kainz
MedIm
59
0
0
28 Mar 2025
DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers
DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers
H. Zhang
R. Su
Zhihang Yuan
Pengtao Chen
Mingzhu Shen Yibo Fan
Shengen Yan
Guohao Dai
Yu Wang
39
0
0
28 Mar 2025
Masked Self-Supervised Pre-Training for Text Recognition Transformers on Large-Scale Datasets
Masked Self-Supervised Pre-Training for Text Recognition Transformers on Large-Scale Datasets
Martin Kiss
Michal Hradiš
39
0
0
28 Mar 2025
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Jaywon Koo
J. Hernandez
Moayed Haji-Ali
Ziyan Yang
Vicente Ordonez
EGVM
72
0
0
27 Mar 2025
LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis
LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis
Jike Zhong
Qilong Wu
Xinyue Li
Bo Zhang
Ming-xing Li
...
Hao Li
Yu Qiao
Peng Gao
Bin Fu
Zhen Li
EGVM
45
0
0
27 Mar 2025
Vision-to-Music Generation: A Survey
Vision-to-Music Generation: A Survey
Zhaokai Wang
Chenxi Bao
Le Zhuo
Jingrui Han
Yang Yue
Yihong Tang
Victor Shea-Jay Huang
Yue Liao
EGVM
VGen
74
1
0
27 Mar 2025
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness
Dian Zheng
Ziqi Huang
Hongbo Liu
Kai Zou
Yinan He
...
Yuyao Zhang
Jingwen He
Wei-Shi Zheng
Yu Qiao
Ziwei Liu
EGVM
VGen
53
6
0
27 Mar 2025
Optimal Stepsize for Diffusion Sampling
Optimal Stepsize for Diffusion Sampling
Jianning Pei
Han Hu
Shuyang Gu
48
0
0
27 Mar 2025
Efficient Multi-Instance Generation with Janus-Pro-Dirven Prompt Parsing
Efficient Multi-Instance Generation with Janus-Pro-Dirven Prompt Parsing
Fan Qi
Yu Duan
Changsheng Xu
DiffM
57
0
0
27 Mar 2025
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Size Wu
W. Zhang
Lumin Xu
Sheng Jin
Zhonghua Wu
Qingyi Tao
Wentao Liu
Wei Li
Chen Change Loy
VGen
153
2
0
27 Mar 2025
DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation
DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation
Haoyu Zhao
Zhongang Qi
Cong Wang
Qingping Zheng
Guansong Lu
Fei Chen
Hang Xu
Zuxuan Wu
DiffM
VGen
48
0
0
27 Mar 2025
Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data
Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data
Zhiyuan Ma
Xinyue Liang
Rongyuan Wu
Xiangyu Zhu
Zhen Lei
Lei Zhang
73
0
0
27 Mar 2025
Can Video Diffusion Model Reconstruct 4D Geometry?
Can Video Diffusion Model Reconstruct 4D Geometry?
Jinjie Mai
Wenxuan Zhu
Haozhe Liu
Bing Li
Cheng Zheng
Jürgen Schmidhuber
Bernard Ghanem
VGen
MDE
74
0
0
27 Mar 2025
Video Motion Graphs
Video Motion Graphs
Haiyang Liu
Zhan Xu
Fa-Ting Hong
Hsin-Ping Huang
Yi Zhou
Yang Zhou
DiffM
VGen
90
0
0
26 Mar 2025
Unified Multimodal Discrete Diffusion
Unified Multimodal Discrete Diffusion
Alexander Swerdlow
Mihir Prabhudesai
Siddharth Gandhi
Deepak Pathak
Katerina Fragkiadaki
DiffM
77
0
0
26 Mar 2025
Forensic Self-Descriptions Are All You Need for Zero-Shot Detection, Open-Set Source Attribution, and Clustering of AI-generated Images
Forensic Self-Descriptions Are All You Need for Zero-Shot Detection, Open-Set Source Attribution, and Clustering of AI-generated Images
Tai D. Nguyen
Aref Azizpour
Matthew C. Stamm
46
1
0
26 Mar 2025
EditCLIP: Representation Learning for Image Editing
EditCLIP: Representation Learning for Image Editing
Qian Wang
Aleksandar Cvejic
Abdelrahman Eldesokey
Peter Wonka
67
0
0
26 Mar 2025
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation
Yuyang Peng
Shishi Xiao
Keming Wu
Qisheng Liao
Bohan Chen
Kevin Lin
Danqing Huang
Ji Li
Yuhui Yuan
DiffM
79
1
0
26 Mar 2025
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models
Alex Jinpeng Wang
Linjie Li
Z. Yang
Lijuan Wang
Min Li
DiffM
73
0
0
26 Mar 2025
Synthetic Video Enhances Physical Fidelity in Video Synthesis
Synthetic Video Enhances Physical Fidelity in Video Synthesis
Qi Zhao
Xingyu Ni
Ziyu Wang
Feng Cheng
Ziyan Yang
Lu Jiang
Bohan Wang
VGen
47
2
0
26 Mar 2025
Scaling Down Text Encoders of Text-to-Image Diffusion Models
Scaling Down Text Encoders of Text-to-Image Diffusion Models
Lifu Wang
Daqing Liu
Xinchen Liu
Xiaodong He
VLM
49
0
0
25 Mar 2025
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Yuchao Gu
Weijia Mao
Mike Zheng Shou
VGen
84
2
0
25 Mar 2025
ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning
ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning
Jiaqi Liao
Z. Yang
Linjie Li
Dianqi Li
Kevin Qinghong Lin
Yu-Xi Cheng
Lijuan Wang
MLLM
LRM
62
0
0
25 Mar 2025
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
Jaihoon Kim
Taehoon Yoon
Jisung Hwang
Minhyuk Sung
DiffM
54
1
0
25 Mar 2025
LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration
LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration
Yuyao Zhang
Jinghao Li
Yu-Wing Tai
DiffM
64
0
0
25 Mar 2025
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
Jiazhi Guan
Kaisiyuan Wang
Zhiliang Xu
Quanwei Yang
Yasheng Sun
...
Errui Ding
J. Wang
Youjian Zhao
Hang Zhou
Ziwei Liu
VGen
44
0
0
25 Mar 2025
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
Zhiqiang Zhang
J. Li
Zunnan Xu
Hanhui Li
Yiji Cheng
Fa-Ting Hong
Qin Lin
Qinglin Lu
Xiaodan Liang
DiffM
73
1
0
25 Mar 2025
FullDiT: Multi-Task Video Generative Foundation Model with Full Attention
FullDiT: Multi-Task Video Generative Foundation Model with Full Attention
Xuan Ju
Weicai Ye
Quande Liu
Qiulin Wang
Xintao Wang
Pengfei Wan
Di Zhang
Kun Gai
Qiang Xu
VGen
46
1
0
25 Mar 2025
IPGO: Indirect Prompt Gradient Optimization for Parameter-Efficient Prompt-level Fine-Tuning on Text-to-Image Models
IPGO: Indirect Prompt Gradient Optimization for Parameter-Efficient Prompt-level Fine-Tuning on Text-to-Image Models
Jianping Ye
Michel Wedel
Kunpeng Zhang
39
0
0
25 Mar 2025
Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning
Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning
Sherry X. Chen
Misha Sra
Pradeep Sen
55
0
0
24 Mar 2025
U-REPA: Aligning Diffusion U-Nets to ViTs
U-REPA: Aligning Diffusion U-Nets to ViTs
Yuchuan Tian
Hanting Chen
Mengyu Zheng
Yuchen Liang
Chao Xu
Yunhe Wang
56
0
0
24 Mar 2025
Panorama Generation From NFoV Image Done Right
Panorama Generation From NFoV Image Done Right
Dian Zheng
Cheng Zhang
Xiao-Ming Wu
Cao Li
Chengfei Lv
Jian-Fang Hu
Wei-Shi Zheng
DiffM
81
0
0
24 Mar 2025
Target-Aware Video Diffusion Models
Target-Aware Video Diffusion Models
Taeksoo Kim
Hanbyul Joo
DiffM
VGen
91
1
0
24 Mar 2025
RomanTex: Decoupling 3D-aware Rotary Positional Embedded Multi-Attention Network for Texture Synthesis
RomanTex: Decoupling 3D-aware Rotary Positional Embedded Multi-Attention Network for Texture Synthesis
Yifei Feng
M. Yang
Steve Yang
Sheng Zhang
J. Yu
Zibo Zhao
Yuhong Liu
Jie Jiang
Chunchao Guo
DiffM
61
0
0
24 Mar 2025
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
59
2
0
24 Mar 2025
Training-free Diffusion Acceleration with Bottleneck Sampling
Training-free Diffusion Acceleration with Bottleneck Sampling
Ye Tian
Xin Xia
Yuxi Ren
Shanchuan Lin
Xing Wang
Xuefeng Xiao
Yunhai Tong
L. Yang
Bin Cui
60
0
0
24 Mar 2025
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models
Weichen Fan
Amber Yijia Zheng
Raymond A. Yeh
Ziwei Liu
55
1
0
24 Mar 2025
Previous
12345...151617
Next