Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.00426
Cited By
PixArt-
α
α
α
: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
30 September 2023
Junsong Chen
Jincheng Yu
Chongjian Ge
Lewei Yao
Enze Xie
Yue Wu
Zhongdao Wang
James T. Kwok
Ping Luo
Huchuan Lu
Zhenguo Li
DiffM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"PixArt-$α$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis"
50 / 86 papers shown
Title
DRAGON: A Large-Scale Dataset of Realistic Images Generated by Diffusion Models
Giulia Bertazzini
Daniele Baracchi
Dasara Shullani
Isao Echizen
Alessandro Piva
27
0
0
16 May 2025
Aquarius: A Family of Industry-Level Video Generation Models for Marketing Scenarios
Huafeng Shi
Jianzhong Liang
Rongchang Xie
Xian Wu
Cheng Chen
Chang Liu
VGen
22
0
0
14 May 2025
Generative Pre-trained Autoregressive Diffusion Transformer
Yuan Zhang
Jiacheng Jiang
Guoqing Ma
Zhiying Lu
Haoyang Huang
Jianlong Yuan
Nan Duan
VGen
43
1
0
12 May 2025
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Divyansh Srivastava
Xiang Zhang
He Wen
Chenru Wen
Zhuowen Tu
DiffM
36
0
0
07 May 2025
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
D. Jiang
Mengmeng Wang
Liuzhuozheng Li
Lei Zhang
Haoyu Wang
Wei Wei
Guang Dai
Yanning Zhang
Jingdong Wang
DiffM
51
0
0
05 May 2025
Improving Editability in Image Generation with Layer-wise Memory
Daneul Kim
Jaeah Lee
Jaesik Park
DiffM
KELM
60
0
0
02 May 2025
JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers
Kwon Byung-Ki
Qi Dai
Lee Hyoseok
Chong Luo
Tae-Hyun Oh
71
0
0
01 May 2025
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
60
0
0
01 May 2025
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
D. Jiang
Ziyu Guo
Renrui Zhang
Zhuofan Zong
Hao Li
Le Zhuo
Shilin Yan
Pheng-Ann Heng
Yiming Li
LRM
72
2
0
01 May 2025
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer
Zechuan Zhang
Ji Xie
Yu Lu
Zongxin Yang
Yuqing Yang
DiffM
97
1
0
29 Apr 2025
Subject-driven Video Generation via Disentangled Identity and Motion
Daneul Kim
Jingxu Zhang
W. Jin
Sunghyun Cho
Qi Dai
Jaesik Park
Chong Luo
DiffM
VGen
115
0
0
23 Apr 2025
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Chunyang Zhang
Zhenhong Sun
Zhicheng Zhang
Junyan Wang
Yu Zhang
Dong Gong
H. Mo
Daoyi Dong
45
0
0
14 Apr 2025
Sculpting Memory: Multi-Concept Forgetting in Diffusion Models via Dynamic Mask and Concept-Aware Optimization
Gen Li
Yang Xiao
Jie Ji
Kaiyuan Deng
Bo Hui
Linke Guo
Xiaolong Ma
24
0
0
12 Apr 2025
RealCam-Vid: High-resolution Video Dataset with Dynamic Scenes and Metric-scale Camera Movements
Guangcong Zheng
Teng Li
Xianpan Zhou
Xi Li
VGen
3DV
69
1
0
11 Apr 2025
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Jiayang Sun
H. Wang
Jie Cao
Huaibo Huang
Ran He
DiffM
76
0
0
10 Apr 2025
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
Jiazi Bu
Pengyang Ling
Yujie Zhou
Pan Zhang
Tong Wu
Xiaoyi Dong
Yuhang Zang
Y. Cao
Dahua Lin
Jiaqi Wang
23
0
0
08 Apr 2025
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models
Prin Phunyaphibarn
Phillip Y. Lee
Jaihoon Kim
Minhyuk Sung
DiffM
89
0
0
26 Mar 2025
UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models
Fanghua Yu
Jinjin Gu
Jinfan Hu
Zheyuan Li
Chao Dong
DiffM
55
0
0
21 Mar 2025
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang
Haoran Chen
Haoyu Zhao
Guansong Lu
Yanwei Fu
Hang Xu
Zuxuan Wu
VGen
DiffM
74
0
0
20 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Yu Qiao
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
84
8
0
13 Mar 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Ju He
Qihang Yu
Qihao Liu
Liang-Chieh Chen
71
0
0
13 Mar 2025
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
Junsong Chen
Shuchen Xue
Yuyang Zhao
Jincheng Yu
Sayak Paul
Junyu Chen
Han Cai
E. Xie
Enze Xie
VLM
66
2
0
12 Mar 2025
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation
Hyeonho Jeong
Suhyeon Lee
Jong Chul Ye
VGen
163
0
0
12 Mar 2025
LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding
Shen Zhang
Yaning Tan
Siyuan Liang
Zhaowei Chen
Linze Li
...
Shuheng Li
Zhenyu Zhao
Caihua Chen
Jiajun Liang
Yao Tang
51
0
0
06 Mar 2025
Zero-Shot Head Swapping in Real-World Scenarios
S. Jeong
Taewoong Kang
Hyojin Jang
Jaegul Choo
39
0
0
02 Mar 2025
Accelerating Diffusion Transformers with Token-wise Feature Caching
Chang Zou
Xuyang Liu
Ting Liu
Siteng Huang
Linfeng Zhang
54
14
0
20 Feb 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM
VGen
77
0
0
18 Feb 2025
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
Guoqing Ma
Haoyang Huang
K. Yan
L. Chen
Nan Duan
...
Yibo Wang
Yuanwei Lu
Yu-Cheng Chen
Yu-Juan Luo
Y. Luo
DiffM
VGen
175
17
0
14 Feb 2025
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
Zhenxing Mi
Kuan-Chieh Jackson Wang
Guocheng Qian
Hanrong Ye
Runtao Liu
Sergey Tulyakov
Kfir Aberman
Dan Xu
LRM
47
0
0
12 Feb 2025
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
Chenguo Lin
Panwang Pan
Bangbang Yang
Zeming Li
Yadong Mu
3DGS
76
7
0
28 Jan 2025
Accelerate High-Quality Diffusion Models with Inner Loop Feedback
M. Gwilliam
Han Cai
Di Wu
Abhinav Shrivastava
Zhiyu Cheng
90
0
0
22 Jan 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Liang Feng
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
152
0
0
21 Jan 2025
EXION: Exploiting Inter- and Intra-Iteration Output Sparsity for Diffusion Models
Jaehoon Heo
Adiwena Putra
Jieon Yoon
Sungwoong Yune
Hangyeol Lee
Ji-Hoon Kim
Joo-Young Kim
DiffM
55
1
0
10 Jan 2025
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
Dongmin Park
Sebin Kim
Taehong Moon
Minkyu Kim
Kangwook Lee
Jaewoong Cho
DiffM
CoGe
64
2
0
08 Jan 2025
ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling
Chaojie Mao
J. Zhang
Yulin Pan
Zeyinzi Jiang
Zhen Han
Yu Liu
Jingren Zhou
DiffM
48
15
0
05 Jan 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
124
2
0
03 Jan 2025
TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions
Vriksha Srihari
R. Bhavya
Shruti Jayaraman
V. Mary Anita Rajam
DiffM
VGen
32
0
0
02 Jan 2025
Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders
Rui Chen
Jianfeng Zhang
Yixun Liang
Guan Luo
Weiyu Li
Jiarui Liu
Xiu Li
Xiaoxiao Long
Jiashi Feng
P. Tan
76
11
0
23 Dec 2024
VidTwin: Video VAE with Decoupled Structure and Dynamics
Yuchi Wang
Junliang Guo
Xinyi Xie
Tianyu He
Xu Sun
Jiang Bian
DRL
VGen
77
3
0
23 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
120
9
0
19 Dec 2024
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration
Lu Liu
Huiyu Duan
Qiang Hu
Liu Yang
Chunlei Cai
Tianxiao Ye
Huayu Liu
Xiaoyun Zhang
Guangtao Zhai
EGVM
97
1
0
17 Dec 2024
AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration
Wenhao Sun
Rong-Cheng Tu
Jingyi Liao
Zhao Jin
Dacheng Tao
VGen
99
1
0
16 Dec 2024
Wonderland: Navigating 3D Scenes from a Single Image
Hanwen Liang
Junli Cao
Vidit Goel
Guocheng Qian
Sergei Korolev
Demetri Terzopoulos
Konstantinos N. Plataniotis
Sergey Tulyakov
Jian Ren
VGen
128
11
0
16 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
196
2
0
14 Dec 2024
Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models
Andreas Müller
Denis Lukovnikov
Jonas Thietke
Asja Fischer
Erwin Quiring
AAML
WIGM
177
4
0
04 Dec 2024
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Zilyu Ye
Zhiyang Chen
Tiancheng Li
Zemin Huang
Weijian Luo
Guo-jun Qi
DiffM
83
5
0
02 Dec 2024
SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis
Hyojun Go
Byeongjun Park
Jiho Jang
Jin-Young Kim
Soonwoo Kwon
Changick Kim
3DGS
116
2
0
25 Nov 2024
On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models
Tariq Berrada Ifriqi
Pietro Astolfi
Melissa Hall
Reyhane Askari Hemmat
Yohann Benchetrit
...
Matthew Muckley
Karteek Alahari
Adriana Romero Soriano
Jakob Verbeek
M. Drozdzal
AI4CE
VLM
57
2
0
05 Nov 2024
Diffusion Beats Autoregressive: An Evaluation of Compositional Generation in Text-to-Image Models
Arash Marioriyad
Parham Rezaei
M. Baghshah
M. Rohban
CoGe
142
0
0
30 Oct 2024
Progressive Compositionality in Text-to-Image Generative Models
Xu Han
Linghao Jin
Xiaofeng Liu
Paul Pu Liang
CoGe
106
2
0
22 Oct 2024
1
2
Next