Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2311.15127
Cited By
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
VGen
Re-assign community
ArXiv (abs)
PDF
HTML
Github (25943★)
Papers citing
"Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"
50 / 332 papers shown
Title
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
Yushu Wu
Zhixing Zhang
Yanyu Li
Yanwu Xu
Anil Kag
...
Ju Hu
Dimitris N. Metaxas
Yanzhi Wang
Sergey Tulyakov
Jian Ren
VGen
DiffM
165
5
0
13 Dec 2024
T-SVG: Text-Driven Stereoscopic Video Generation
Qiao Jin
Xiaodong Chen
Wu Liu
Tao Mei
Yongdong Zhang
DiffM
VGen
135
2
0
12 Dec 2024
Mojito: Motion Trajectory and Intensity Control for Video Generation
Xuehai He
Shuohang Wang
Jianwei Yang
Xiaoxia Wu
Yansen Wang
Kuan-Chieh Wang
Z. Zhan
Olatunji Ruwase
Yelong Shen
Xinze Wang
VGen
236
2
0
12 Dec 2024
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
Zhen Liu
Tim Z. Xiao
Weiyang Liu
Yoshua Bengio
Dinghuai Zhang
252
6
0
10 Dec 2024
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
Baorui Ma
Huachen Gao
Haoge Deng
Zhengxiong Luo
Tiejun Huang
Lulu Tang
Xinlong Wang
DiffM
VGen
266
16
0
09 Dec 2024
Birth and Death of a Rose
Chen Geng
Yunzhi Zhang
Shangzhe Wu
Jiajun Wu
AI4CE
118
2
0
06 Dec 2024
InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models
Yifan Lu
Xuanchi Ren
Jiawei Yang
Tianchang Shen
Zhangjie Wu
...
Yanjie Wang
Siheng Chen
Mike Chen
Sanja Fidler
Jiahui Huang
VGen
180
9
0
05 Dec 2024
Navigation World Models
Amir Bar
G. Zhou
Danny Tran
Trevor Darrell
Yann LeCun
VGen
EgoV
212
33
0
04 Dec 2024
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Zilyu Ye
Zhiyang Chen
Tiancheng Li
Zemin Huang
Weijian Luo
Guo-Jun Qi
DiffM
132
6
0
02 Dec 2024
Human Action CLIPs: Detecting AI-generated Human Motion
Matyáš Boháček
Hany Farid
143
4
0
30 Nov 2024
Fleximo: Towards Flexible Text-to-Human Motion Video Generation
Yuhang Zhang
Yuan Zhou
Zeyu Liu
Yuxuan Cai
Qiuyue Wang
Aidong Men
Huan Yang
VGen
DiffM
130
1
0
29 Nov 2024
Motion Modes: What Could Happen Next?
Karran Pandey
Matheus Gadelha
Yannick Hold-Geoffroy
Karan Singh
Niloy J. Mitra
Paul Guerrero
VGen
DiffM
140
2
0
29 Nov 2024
OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation
Hui Li
Mingwang Xu
Yun Zhan
Shan Mu
Jiaye Li
...
Yukang Chen
Tan Chen
Mao Ye
Jingdong Wang
Siyu Zhu
VGen
210
7
0
28 Nov 2024
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Feng Liu
Shiwei Zhang
Xiaofeng Wang
Yujie Wei
Haonan Qiu
Yuzhong Zhao
Yingya Zhang
Qixiang Ye
Fang Wan
VGen
AI4TS
216
30
0
28 Nov 2024
PCDreamer: Point Cloud Completion Through Multi-view Diffusion Priors
Guangshun Wei
Yuan Feng
Long Ma
Chen Wang
Yuanfeng Zhou
Changjian Li
566
0
0
28 Nov 2024
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
Zongjian Li
Bin Lin
Yang Ye
Liuhan Chen
Xinhua Cheng
Shenghai Yuan
Li-xin Yuan
VGen
DiffM
170
20
0
26 Nov 2024
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
Kaifeng Gao
Jiaxin Shi
Hanwang Zhang
Chunping Wang
Jun Xiao
Long Chen
VGen
DiffM
211
4
0
25 Nov 2024
Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Xiaozhong Ji
Xiaobin Hu
Zhihong Xu
Junwei Zhu
Chuming Lin
...
Donghao Luo
Yi Chen
Qin Lin
Qinglin Lu
Chengjie Wang
VGen
148
11
0
25 Nov 2024
Generative Omnimatte: Learning to Decompose Video into Layers
Yao-Chih Lee
Erika Lu
Sarah Rumbley
Michal Geyer
Jia-Bin Huang
Tali Dekel
Forrester Cole
DiffM
VGen
203
8
0
25 Nov 2024
MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model
Chenjie Cao
Chaohui Yu
Shang Liu
Fan Wang
Xiangyang Xue
Yanwei Fu
148
2
0
25 Nov 2024
Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors
Soumava Paul
Prakhar Kaushik
Alan Yuille
3DGS
DiffM
544
0
0
24 Nov 2024
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu
Mingyu Liu
Zeyu Zhu
Xi Xia
Haoen Feng
Wen Wang
Kevin Qinghong Lin
Chunhua Shen
Mike Zheng Shou
DiffM
VGen
230
3
0
22 Nov 2024
PhysFlow: Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation
Zhuoman Liu
Weicai Ye
Yan Luximon
Pengfei Wan
Di Zhang
VGen
AI4CE
181
6
0
21 Nov 2024
SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input
Zhen Lv
Yangqi Long
Congzhentao Huang
Cao Li
Chengfei Lv
Hao Ren
Dian Zheng
DiffM
VGen
MDE
189
6
0
18 Nov 2024
I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength
Wanquan Feng
Jiawei Liu
Pengqi Tu
Tianhao Qi
Mingzhen Sun
Tianxiang Ma
Mingcong Liu
Siyu Zhou
Qian He
VGen
173
10
0
10 Nov 2024
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
Wenqiang Sun
Shuo Chen
Fan Liu
Zilong Chen
Yueqi Duan
Jun Zhang
Yikai Wang
VGen
121
41
0
07 Nov 2024
StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration
Panwen Hu
Jin Jiang
Jianqi Chen
Mingfei Han
Shengcai Liao
Xiaojun Chang
Xiaodan Liang
VGen
DiffM
128
6
0
07 Nov 2024
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
Koichi Namekata
Sherwin Bahmani
Ziyi Wu
Yash Kant
Igor Gilitschenski
David B. Lindell
VGen
171
16
0
07 Nov 2024
On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection
Xiufeng Song
Xiao Guo
Junxuan Zhang
Qirui Li
Lei Bai
Xiaoming Liu
Guangtao Zhai
Xiaohong Liu
VGen
DiffM
177
12
0
31 Oct 2024
Video prediction using score-based conditional density estimation
P. Fiquet
Eero P. Simoncelli
AI4TS
48
0
0
30 Oct 2024
One Prompt to Verify Your Models: Black-Box Text-to-Image Models Verification via Non-Transferable Adversarial Attacks
Ji Guo
Wenbo Jiang
Rui Zhang
Guoming Lu
Hongwei Li
AAML
160
0
0
30 Oct 2024
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Zongyi Li
Shujie Hu
Shujie Liu
Long Zhou
Jeongsoo Choi
Lingwei Meng
Xun Guo
Jiajian Li
H. Ling
Furu Wei
VGen
DiffM
150
7
0
27 Oct 2024
Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances
Shilin Lu
Zihan Zhou
Jiayou Lu
Yuanzhi Zhu
A. Kong
WIGM
143
15
0
24 Oct 2024
FreeVS: Generative View Synthesis on Free Driving Trajectory
Qitai Wang
Lue Fan
Yuqi Wang
Yuntao Chen
Zhaoxiang Zhang
VGen
114
8
0
23 Oct 2024
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
Jiayi Liu
Denys Iliash
Angel X. Chang
Manolis Savva
Ali Mahdavi-Amiri
163
13
0
21 Oct 2024
EVA: An Embodied World Model for Future Video Anticipation
Xiaowei Chi
Hengyuan Zhang
Chun-Kai Fan
Xingqun Qi
Rongyu Zhang
...
Chi-Min Chan
Wei Xue
Wenhan Luo
Shanghang Zhang
Yike Guo
VGen
91
8
0
20 Oct 2024
FrameBridge: Improving Image-to-Video Generation with Bridge Models
Yuji Wang
Zehua Chen
Xiaoyu Chen
Jun-Jie Zhu
Jianfei Chen
Jianfei Chen
DiffM
VGen
513
5
0
20 Oct 2024
Assessing Open-world Forgetting in Generative Image Model Customization
Héctor Laria
Alex Gomez-Villa
Imad Eddine Marouf
Bogdan Raducanu
Bogdan Raducanu
VLM
DiffM
112
0
0
18 Oct 2024
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation
Guosheng Zhao
Chaojun Ni
Xiaofeng Wang
Zheng Zhu
Xinming Zhang
...
Xinze Chen
Boyuan Wang
Youyi Zhang
Wenjun Mei
Xingang Wang
VGen
174
32
0
17 Oct 2024
DepthSplat: Connecting Gaussian Splatting and Depth
Haofei Xu
Songyou Peng
Fangjinhua Wang
Hermann Blum
Dániel Baráth
Andreas Geiger
Marc Pollefeys
3DGS
MDE
119
39
0
17 Oct 2024
Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing
Mingce Guo
Jingxuan He
Shengeng Tang
Zhangye Wang
Lechao Cheng
VGen
DiffM
136
0
0
16 Oct 2024
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Ziyue Li
Dinesh Manocha
MoE
157
19
0
14 Oct 2024
Distillation of Discrete Diffusion through Dimensional Correlations
Satoshi Hayakawa
Yuhta Takida
Masaaki Imaizumi
Hiromi Wakaki
Yuki Mitsufuji
DiffM
170
4
0
11 Oct 2024
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation
Jiahao Cui
Hui Li
Yao Yao
Hao Zhu
Hanlin Shang
Kaihui Cheng
Hang Zhou
Siyu Zhu
Jingdong Wang
DiffM
VGen
101
29
0
10 Oct 2024
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
Qingwen Bu
Hongyang Li
Li Chen
Jisong Cai
Jia Zeng
Heming Cui
Maoqing Yao
Yu Qiao
150
11
0
10 Oct 2024
Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content
Qiuheng Wang
Yukai Shi
Jiarong Ou
Ruoxin Chen
Ke Lin
...
Mingwu Zheng
Xin Tao
Fei Yang
Pengfei Wan
Di Zhang
VGen
155
34
0
10 Oct 2024
Progressive Autoregressive Video Diffusion Models
Desai Xie
Zhan Xu
Yicong Hong
Hao Tan
Difan Liu
Feng Liu
Arie E. Kaufman
Yang Zhou
DiffM
VGen
127
15
0
10 Oct 2024
AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation
Yukang Cao
Liang Pan
Kai Han
Kwan-Yee K. Wong
Ziwei Liu
VGen
129
6
0
09 Oct 2024
ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler
Serin Yang
Taesung Kwon
Jong Chul Ye
VGen
DiffM
111
7
0
08 Oct 2024
Pyramidal Flow Matching for Efficient Video Generative Modeling
Yang Jin
Zhicheng Sun
Ningyuan Li
Kun Xu
K. Xu
...
Nan Zhuang
Quzhe Huang
Yang Song
Yadong Mu
Zhouchen Lin
VGen
168
87
0
08 Oct 2024
Previous
1
2
3
4
5
6
7
Next