ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2209.12152
  4. Cited By
All are Worth Words: A ViT Backbone for Diffusion Models

All are Worth Words: A ViT Backbone for Diffusion Models

25 September 2022
Fan Bao
Shen Nie
Kaiwen Xue
Yue Cao
Chongxuan Li
Hang Su
Jun Zhu
    VLM
ArXivPDFHTML

Papers citing "All are Worth Words: A ViT Backbone for Diffusion Models"

50 / 115 papers shown
Title
FLEX: A Backbone for Diffusion-Based Modeling of Spatio-temporal Physical Systems
FLEX: A Backbone for Diffusion-Based Modeling of Spatio-temporal Physical Systems
N. Benjamin Erichson
Vinicius Mikuni
Dongwei Lyu
Yang Gao
Omri Azencot
Soon Hoe Lim
Michael W. Mahoney
AI4CE
858
0
0
23 May 2025
Swin DiT: Diffusion Transformer using Pseudo Shifted Windows
Swin DiT: Diffusion Transformer using Pseudo Shifted Windows
Jiafu Wu
Yabiao Wang
Jian Li
Jinlong Peng
Yun Cao
Chengjie Wang
Jiangning Zhang
180
0
0
19 May 2025
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Xu Ma
Peize Sun
Haoyu Ma
Hao Tang
Chih-Yao Ma
...
Matt Feiszli
Peizhao Zhang
Peter Vajda
Sam S. Tsai
Y. Fu
136
2
0
24 Apr 2025
U-Shape Mamba: State Space Model for faster diffusion
U-Shape Mamba: State Space Model for faster diffusion
Alex Ergasti
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
Mamba
112
1
0
18 Apr 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
124
0
0
09 Apr 2025
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
Tsu-Jui Fu
Yusu Qian
Chen Chen
Wenze Hu
Zhe Gan
Yue Yang
173
2
0
16 Mar 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Ju He
Qihang Yu
Qihao Liu
Liang-Chieh Chen
113
1
0
13 Mar 2025
Rethinking Diffusion Model in High Dimension
Rethinking Diffusion Model in High Dimension
Zhenxin Zheng
Zhenjie Zheng
DiffM
82
0
0
11 Mar 2025
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
Xing Xie
Jiawei Liu
Ziyue Lin
Huijie Fan
Zhi Han
Yandong Tang
Liangqiong Qu
86
0
0
10 Mar 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
152
9
0
27 Feb 2025
DiffGuard: Text-Based Safety Checker for Diffusion Models
DiffGuard: Text-Based Safety Checker for Diffusion Models
Massine El Khader
Elias Al Bouzidi
Abdellah Oumida
Mohammed Sbaihi
Eliott Binard
Jean-Philippe Poli
Wassila Ouerdane
Boussad Addad
Katarzyna Kapusta
DiffM
180
0
0
20 Feb 2025
Large Language Diffusion Models
Large Language Diffusion Models
Shen Nie
Fengqi Zhu
Zebin You
Xiaolu Zhang
Jingyang Ou
Jun Hu
Jun Zhou
Yankai Lin
Ji-Rong Wen
Chongxuan Li
185
47
0
14 Feb 2025
History-Guided Video Diffusion
Kiwhan Song
Boyuan Chen
Max Simchowitz
Yilun Du
Russ Tedrake
Vincent Sitzmann
VGen
159
14
0
10 Feb 2025
Visual Generation Without Guidance
Huayu Chen
Kai Jiang
Kaiwen Zheng
Jianfei Chen
Hang Su
Jun Zhu
132
2
0
28 Jan 2025
Robust Representation Consistency Model via Contrastive Denoising
Robust Representation Consistency Model via Contrastive Denoising
Jiachen Lei
Julius Berner
Jiongxiao Wang
Zhongzhu Chen
Zhongjia Ba
Kui Ren
Jun Zhu
Anima Anandkumar
DiffM
118
0
0
22 Jan 2025
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Dongwon Kim
Ju He
Qihang Yu
Chenglin Yang
Xiaohui Shen
Suha Kwak
Liang-Chieh Chen
VLM
106
7
0
13 Jan 2025
SOEDiff: Efficient Distillation for Small Object Editing
SOEDiff: Efficient Distillation for Small Object Editing
Yiming Wu
Qihe Pan
Zhen Zhao
Zicheng Wang
Sifan Long
Ronghua Liang
DiffM
97
0
0
03 Jan 2025
DiC: Rethinking Conv3x3 Designs in Diffusion Models
DiC: Rethinking Conv3x3 Designs in Diffusion Models
Yuchuan Tian
Jing Han
Chengcheng Wang
Yuchen Liang
Chao Xu
Hanting Chen
DiffM
97
2
0
31 Dec 2024
Video Diffusion Transformers are In-Context Learners
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
329
3
0
14 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
Hong Chen
Zihan Wang
Xianrui Li
Xingwu Sun
Fangyi Chen
Jiang Liu
Jiadong Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
171
8
0
14 Dec 2024
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
Ximing Xing
Juncheng Hu
Jing Zhang
Dong Xu
Qian Yu
136
3
0
11 Dec 2024
Pretrained Reversible Generation as Unsupervised Visual Representation Learning
Pretrained Reversible Generation as Unsupervised Visual Representation Learning
Rongkun Xue
Jinouwen Zhang
Yazhe Niu
Dazhong Shen
Bingqi Ma
Yu Liu
Jing Yang
129
0
0
29 Nov 2024
More Expressive Attention with Negative Weights
More Expressive Attention with Negative Weights
Ang Lv
Ruobing Xie
Shuaipeng Li
Jiayi Liao
Xingwu Sun
Zhanhui Kang
Di Wang
Rui Yan
78
1
0
11 Nov 2024
DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation
DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation
Hao Phung
Quan Dao
T. Dao
Hoang Phan
Dimitris Metaxas
Anh Tran
Mamba
116
5
0
06 Nov 2024
Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion
Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion
Emiel Hoogeboom
Thomas Mensink
Jonathan Heek
Kay Lamerigts
Ruiqi Gao
Tim Salimans
391
9
0
25 Oct 2024
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
Xuyuan Li
Zengqiang Shang
Hua Hua
Peiyang Shi
Chen Yang
Li Wang
Pengyuan Zhang
112
2
0
16 Oct 2024
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai
Tian-Chun Ye
Wei Chow
Enxin Song
Qing-Guo Chen
Hefei Ling
Zhen Dong
Lei Zhu
96
18
0
10 Oct 2024
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Songming Liu
Lingxuan Wu
Bangguo Li
Hengkai Tan
Huayu Chen
Zhengyi Wang
Ke Xu
Hang Su
Jun Zhu
106
111
0
10 Oct 2024
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu
Sangkyung Kwak
Huiwon Jang
Jongheon Jeong
Jonathan Huang
Jinwoo Shin
Saining Xie
OCL
138
85
0
09 Oct 2024
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
Jintao Zhang
Jia Wei
Pengle Zhang
Jun-Jie Zhu
Jun Zhu
Jianfei Chen
VLM
MQ
131
31
0
03 Oct 2024
MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
X. Wang
Siming Fu
Qihan Huang
Wanggui He
Hao Jiang
DiffM
79
52
0
11 Jun 2024
PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction
PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction
Eduard Poesina
Adriana Valentina Costache
Adrian-Gabriel Chifu
Josiane Mothe
Radu Tudor Ionescu
VLM
106
1
0
07 Jun 2024
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
Jingyang Ou
Shen Nie
Kaiwen Xue
Fengqi Zhu
Jiacheng Sun
Zhenguo Li
Chongxuan Li
DiffM
97
50
0
06 Jun 2024
ReDistill: Residual Encoded Distillation for Peak Memory Reduction of CNNs
ReDistill: Residual Encoded Distillation for Peak Memory Reduction of CNNs
Fang Chen
Gourav Datta
Mujahid Al Rafi
Hyeran Jeon
Meng Tang
131
1
0
06 Jun 2024
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Tianchen Zhao
Tongcheng Fang
Haofeng Huang
Enshu Liu
Widyadewi Soedarmadji
...
Shengen Yan
Huazhong Yang
Xuefei Ning
Xuefei Ning
Yu Wang
MQ
VGen
164
33
0
04 Jun 2024
SparseDM: Toward Sparse Efficient Diffusion Models
SparseDM: Toward Sparse Efficient Diffusion Models
Kafeng Wang
Jianfei Chen
He Li
Zhenpeng Mi
Jun-Jie Zhu
DiffM
109
10
0
16 Apr 2024
Latte: Latent Diffusion Transformer for Video Generation
Latte: Latent Diffusion Transformer for Video Generation
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Ziqiang Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
DiffM
VGen
193
269
0
05 Jan 2024
Application-driven Validation of Posteriors in Inverse Problems
Application-driven Validation of Posteriors in Inverse Problems
T. Adler
Jan-Hinrich Nolke
Annika Reinke
M. Tizabi
Sebastian Gruber
...
Lynton Ardizzone
Paul F. Jaeger
Florian Buettner
Ullrich Kothe
Lena Maier-Hein
MedIm
61
1
0
18 Sep 2023
Imagen Video: High Definition Video Generation with Diffusion Models
Imagen Video: High Definition Video Generation with Diffusion Models
Jonathan Ho
William Chan
Chitwan Saharia
Jay Whang
Ruiqi Gao
...
Diederik P. Kingma
Ben Poole
Mohammad Norouzi
David J. Fleet
Tim Salimans
VGen
105
1,528
0
05 Oct 2022
Equivariant Energy-Guided SDE for Inverse Molecular Design
Equivariant Energy-Guided SDE for Inverse Molecular Design
Fan Bao
Min Zhao
Zhongkai Hao
Pei‐Yun Li
Chongxuan Li
Jun Zhu
DiffM
218
67
0
30 Sep 2022
DreamFusion: Text-to-3D using 2D Diffusion
DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole
Ajay Jain
Jonathan T. Barron
B. Mildenhall
124
2,407
0
29 Sep 2022
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Wenhu Chen
Hexiang Hu
Chitwan Saharia
William W. Cohen
VLM
154
173
0
29 Sep 2022
Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis
Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis
Wanshu Fan
Yen-Chun Chen
Dongdong Chen
Yu Cheng
Lu Yuan
Yu-Chiang Frank Wang
DiffM
63
94
0
29 Aug 2022
Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion Model
Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion Model
Xiulong Yang
Sheng-Min Shih
Yinlin Fu
Xiaoting Zhao
Shihao Ji
DiffM
52
56
0
16 Aug 2022
Analog Bits: Generating Discrete Data using Diffusion Models with
  Self-Conditioning
Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning
Ting-Li Chen
Ruixiang Zhang
Geoffrey E. Hinton
DiffM
74
306
0
08 Aug 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
150
1,765
0
02 Aug 2022
Classifier-Free Diffusion Guidance
Classifier-Free Diffusion Guidance
Jonathan Ho
Tim Salimans
FaML
173
3,882
0
26 Jul 2022
EGSDE: Unpaired Image-to-Image Translation via Energy-Guided Stochastic
  Differential Equations
EGSDE: Unpaired Image-to-Image Translation via Energy-Guided Stochastic Differential Equations
Min Zhao
Fan Bao
Chongxuan Li
Jun Zhu
DiffM
91
194
0
14 Jul 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
178
1,114
0
22 Jun 2022
Maximum Likelihood Training for Score-Based Diffusion ODEs by High-Order
  Denoising Score Matching
Maximum Likelihood Training for Score-Based Diffusion ODEs by High-Order Denoising Score Matching
Cheng Lu
Kaiwen Zheng
Fan Bao
Jianfei Chen
Chongxuan Li
Jun Zhu
DiffM
82
85
0
16 Jun 2022
123
Next