2402.14797
Cited By
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
22 February 2024
Willi Menapace
Aliaksandr Siarohin
Ivan Skorokhodov
Ekaterina Deyneka
Tsai-Shien Chen
Anil Kag
Yuwei Fang
A. Stoliar
Elisa Ricci
Jian Ren
Sergey Tulyakov
VGen
Papers citing "Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis"
50 / 72 papers shown
Title
Video-GPT via Next Clip Diffusion
Shaobin Zhuang
Zhipeng Huang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Binxin Yang
Chong Sun
Chen Li
Yali Wang
DiffM
VGen
93
0
0
18 May 2025
ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos
Haolin Yang
Feilong Tang
Ming Hu
Yulong Li
Junjie Guo
...
Zelin Peng
Junjun He
Zongyuan Ge
Imran Razzak
DiffM
VGen
164
2
0
20 Mar 2025
Multi-subject Open-set Personalization in Video Generation
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Yuwei Fang
Kwot Sin Lee
Ivan Skorokhodov
Kfir Aberman
Jun-Yan Zhu
Ming-Hsuan Yang
Sergey Tulyakov
DiffM
VGen
107
7
0
10 Jan 2025
AKiRa: Augmentation Kit on Rays for optical video generation
Xi Wang
Robin Courant
Marc Christie
Vicky Kalogeiton
VGen
125
3
0
31 Dec 2024
Wonderland: Navigating 3D Scenes from a Single Image
Hanwen Liang
Junli Cao
Vidit Goel
Guocheng Qian
Sergei Korolev
Demetri Terzopoulos
Konstantinos N. Plataniotis
Sergey Tulyakov
Jian Ren
VGen
160
12
0
16 Dec 2024
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
Sherwin Bahmani
Ivan Skorokhodov
Aliaksandr Siarohin
Willi Menapace
Guocheng Qian
...
Chaoyang Wang
Jiaxu Zou
Andrea Tagliasacchi
David B. Lindell
Sergey Tulyakov
VGen
DiffM
145
42
0
17 Jul 2024
Matryoshka Diffusion Models
Jiatao Gu
Shuangfei Zhai
Yizhe Zhang
Joshua M. Susskind
Navdeep Jaitly
DiffM
53
45
0
23 Oct 2023
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
Yaofang Liu
Xiaodong Cun
Xuebo Liu
Xintao Wang
Yong Zhang
Haoxin Chen
Yang Liu
Tieyong Zeng
Raymond H. F. Chan
Ying Shan
VGen
EGVM
41
131
0
17 Oct 2023
Relay Diffusion: Unifying diffusion process across resolutions for image synthesis
Jiayan Teng
Wendi Zheng
Ming Ding
Wenyi Hong
Jianqiao Wangni
Zhuoyi Yang
Jie Tang
DiffM
49
42
0
04 Sep 2023
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
Yuwei Guo
Ceyuan Yang
Anyi Rao
Zhengyang Liang
Yaohui Wang
Yu Qiao
Maneesh Agrawala
Dahua Lin
Bo Dai
VGen
51
809
0
10 Jul 2023
SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds
Yanyu Li
Huan Wang
Qing Jin
Ju Hu
Pavlo Chemerys
Yun Fu
Yanzhi Wang
Sergey Tulyakov
Jian Ren
VLM
46
153
0
01 Jun 2023
FIT: Far-reaching Interleaved Transformers
Ting Chen
Lala Li
55
12
0
22 May 2023
Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
Wenjing Wang
Huan Yang
Zixi Tuo
Huiguo He
Sitong Su
Jianlong Fu
Jiaying Liu
DiffM
VGen
80
114
0
18 May 2023
Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
Songwei Ge
Seungjun Nah
Guilin Liu
Tyler Poon
Andrew Tao
Bryan Catanzaro
David Jacobs
Jia-Bin Huang
Ming-Yu Liu
Yogesh Balaji
DiffM
VGen
64
257
0
17 May 2023
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
A. Blattmann
Robin Rombach
Huan Ling
Tim Dockhorn
Seung Wook Kim
Sanja Fidler
Karsten Kreis
3DGS
VGen
154
1,044
0
18 Apr 2023
Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation
Jie An
Songyang Zhang
Harry Yang
Sonal Gupta
Jia-Bin Huang
Jiebo Luo
Xi Yin
DiffM
VGen
46
109
0
17 Apr 2023
MoStGAN-V: Video Generation with Temporal Motion Styles
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VGen
41
30
0
05 Apr 2023
NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation
Shengming Yin
Chenfei Wu
Huan Yang
Jianfeng Wang
Xiaodong Wang
...
Gong Ming
Lijuan Wang
Zicheng Liu
Houqiang Li
Nan Duan
VGen
31
129
0
22 Mar 2023
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
Chenyang Qi
Xiaodong Cun
Yong Zhang
Chenyang Lei
Xintao Wang
Ying Shan
Qifeng Chen
VGen
56
336
0
16 Mar 2023
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
Zhengxiong Luo
Dayou Chen
Yingya Zhang
Yan Huang
Liangsheng Wang
Yujun Shen
Deli Zhao
Jingren Zhou
Tieniu Tan
DiffM
VGen
149
314
0
15 Mar 2023
Structure and Content-Guided Video Synthesis with Diffusion Models
Patrick Esser
Johnathan Chiu
Parmida Atighehchian
Jonathan Granskog
Anastasis Germanidis
DiffM
VGen
125
517
0
06 Feb 2023
Simple diffusion: End-to-end diffusion for high resolution images
Emiel Hoogeboom
Jonathan Heek
Tim Salimans
62
253
0
26 Jan 2023
On the Importance of Noise Scheduling for Diffusion Models
Ting Chen
DiffM
36
151
0
26 Jan 2023
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
47
234
0
10 Dec 2022
Latent Video Diffusion Models for High-Fidelity Long Video Generation
Yingqing He
Tianyu Yang
Yong Zhang
Ying Shan
Qifeng Chen
DiffM
VGen
34
213
0
23 Nov 2022
MagicVideo: Efficient Video Generation With Latent Diffusion Models
Daquan Zhou
Weimin Wang
Hanshu Yan
Weiwei Lv
Yizhe Zhu
Jiashi Feng
DiffM
VGen
59
380
0
20 Nov 2022
InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
135
1,745
0
17 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Ming-Yu Liu
VLM
MoE
109
811
0
02 Nov 2022
f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation
Jiatao Gu
Shuangfei Zhai
Yizhe Zhang
Miguel Angel Bautista
J. Susskind
DiffM
74
26
0
10 Oct 2022
Phenaki: Variable Length Video Generation From Open Domain Textual Description
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
DiffM
VGen
92
381
0
05 Oct 2022
Imagen Video: High Definition Video Generation with Diffusion Models
Jonathan Ho
William Chan
Chitwan Saharia
Jay Whang
Ruiqi Gao
...
Diederik P. Kingma
Ben Poole
Mohammad Norouzi
David J. Fleet
Tim Salimans
VGen
88
1,501
0
05 Oct 2022
Make-A-Video: Text-to-Video Generation without Text-Video Data
Uriel Singer
Adam Polyak
Thomas Hayes
Xi Yin
Jie An
...
Oron Ashual
Oran Gafni
Devi Parikh
Sonal Gupta
Yaniv Taigman
DiffM
VGen
61
1,373
0
29 Sep 2022
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz
Yuanzhen Li
Varun Jampani
Yael Pritch
Michael Rubinstein
Kfir Aberman
169
2,789
0
25 Aug 2022
Classifier-Free Diffusion Guidance
Jonathan Ho
Tim Salimans
FaML
49
3,786
0
26 Jul 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
153
1,089
0
22 Jun 2022
Generating Long Videos of Dynamic Scenes
Tim Brooks
Janne Hellsten
M. Aittala
Ting-Chun Wang
Timo Aila
J. Lehtinen
Ming-Yu Liu
Alexei A. Efros
Tero Karras
SyDa
14
104
0
07 Jun 2022
Elucidating the Design Space of Diffusion-Based Generative Models
Tero Karras
M. Aittala
Timo Aila
S. Laine
DiffM
117
1,907
0
01 Jun 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
270
585
0
29 May 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
223
5,904
0
23 May 2022
Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
Songwei Ge
Thomas Hayes
Harry Yang
Xi Yin
Guan Pang
David Jacobs
Jia-Bin Huang
Devi Parikh
ViT
72
215
0
07 Apr 2022
Video Diffusion Models
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM
VGen
117
1,563
0
07 Apr 2022
The Role of ImageNet Classes in Fréchet Inception Distance
Tuomas Kynkaanniemi
Tero Karras
M. Aittala
Timo Aila
J. Lehtinen
EGVM
VLM
63
201
0
11 Mar 2022
Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks
Sihyun Yu
Jihoon Tack
Sangwoo Mo
Hyunsu Kim
Junho Kim
Jung-Woo Ha
Jinwoo Shin
DiffM
VGen
61
200
0
21 Feb 2022
StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2
Ivan Skorokhodov
Sergey Tulyakov
Mohamed Elhoseiny
VGen
57
283
0
29 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
199
15,081
0
20 Dec 2021
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Chenfei Wu
Jian Liang
Lei Ji
Fan Yang
Yuejian Fang
Daxin Jiang
Nan Duan
ViT
VGen
42
294
0
24 Nov 2021
CCVS: Context-aware Controllable Video Synthesis
G. L. Moing
Jean Ponce
Cordelia Schmid
44
78
0
16 Jul 2021
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal
Alex Nichol
116
7,639
0
11 May 2021
A Good Image Generator Is What You Need for High-Resolution Video Synthesis
Yu Tian
Jian Ren
Menglei Chai
Kyle Olszewski
Xi Peng
Dimitris N. Metaxas
Sergey Tulyakov
VGen
85
186
0
30 Apr 2021
GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions
Chenfei Wu
Lun Huang
Qianxi Zhang
Binyang Li
Lei Ji
Fan Yang
Guillermo Sapiro
Nan Duan
DiffM
VGen
42
235
0
30 Apr 2021