ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2311.15127
  4. Cited By
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large
  Datasets

Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
    VGen
ArXiv (abs)PDFHTMLGithub (25943★)

Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"

50 / 332 papers shown
Title
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
Yushu Wu
Zhixing Zhang
Yanyu Li
Yanwu Xu
Anil Kag
...
Ju Hu
Dimitris N. Metaxas
Yanzhi Wang
Sergey Tulyakov
Jian Ren
VGenDiffM
165
5
0
13 Dec 2024
T-SVG: Text-Driven Stereoscopic Video Generation
T-SVG: Text-Driven Stereoscopic Video Generation
Qiao Jin
Xiaodong Chen
Wu Liu
Tao Mei
Yongdong Zhang
DiffMVGen
135
2
0
12 Dec 2024
Mojito: Motion Trajectory and Intensity Control for Video Generation
Mojito: Motion Trajectory and Intensity Control for Video Generation
Xuehai He
Shuohang Wang
Jianwei Yang
Xiaoxia Wu
Yansen Wang
Kuan-Chieh Wang
Z. Zhan
Olatunji Ruwase
Yelong Shen
Xinze Wang
VGen
236
2
0
12 Dec 2024
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
Zhen Liu
Tim Z. Xiao
Weiyang Liu
Yoshua Bengio
Dinghuai Zhang
252
6
0
10 Dec 2024
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
Baorui Ma
Huachen Gao
Haoge Deng
Zhengxiong Luo
Tiejun Huang
Lulu Tang
Xinlong Wang
DiffMVGen
266
16
0
09 Dec 2024
Birth and Death of a Rose
Birth and Death of a Rose
Chen Geng
Yunzhi Zhang
Shangzhe Wu
Jiajun Wu
AI4CE
118
2
0
06 Dec 2024
InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models
InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models
Yifan Lu
Xuanchi Ren
Jiawei Yang
Tianchang Shen
Zhangjie Wu
...
Yanjie Wang
Siheng Chen
Mike Chen
Sanja Fidler
Jiahui Huang
VGen
180
9
0
05 Dec 2024
Navigation World Models
Navigation World Models
Amir Bar
G. Zhou
Danny Tran
Trevor Darrell
Yann LeCun
VGenEgoV
212
33
0
04 Dec 2024
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Zilyu Ye
Zhiyang Chen
Tiancheng Li
Zemin Huang
Weijian Luo
Guo-Jun Qi
DiffM
132
6
0
02 Dec 2024
Human Action CLIPs: Detecting AI-generated Human Motion
Human Action CLIPs: Detecting AI-generated Human Motion
Matyáš Boháček
Hany Farid
143
4
0
30 Nov 2024
Fleximo: Towards Flexible Text-to-Human Motion Video Generation
Fleximo: Towards Flexible Text-to-Human Motion Video Generation
Yuhang Zhang
Yuan Zhou
Zeyu Liu
Yuxuan Cai
Qiuyue Wang
Aidong Men
Huan Yang
VGenDiffM
130
1
0
29 Nov 2024
Motion Modes: What Could Happen Next?
Karran Pandey
Matheus Gadelha
Yannick Hold-Geoffroy
Karan Singh
Niloy J. Mitra
Paul Guerrero
VGenDiffM
140
2
0
29 Nov 2024
OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation
Hui Li
Mingwang Xu
Yun Zhan
Shan Mu
Jiaye Li
...
Yukang Chen
Tan Chen
Mao Ye
Jingdong Wang
Siyu Zhu
VGen
210
7
0
28 Nov 2024
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Feng Liu
Shiwei Zhang
Xiaofeng Wang
Yujie Wei
Haonan Qiu
Yuzhong Zhao
Yingya Zhang
Qixiang Ye
Fang Wan
VGenAI4TS
216
30
0
28 Nov 2024
PCDreamer: Point Cloud Completion Through Multi-view Diffusion Priors
PCDreamer: Point Cloud Completion Through Multi-view Diffusion Priors
Guangshun Wei
Yuan Feng
Long Ma
Chen Wang
Yuanfeng Zhou
Changjian Li
566
0
0
28 Nov 2024
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
Zongjian Li
Bin Lin
Yang Ye
Liuhan Chen
Xinhua Cheng
Shenghai Yuan
Li-xin Yuan
VGenDiffM
170
20
0
26 Nov 2024
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
Kaifeng Gao
Jiaxin Shi
Hanwang Zhang
Chunping Wang
Jun Xiao
Long Chen
VGenDiffM
211
4
0
25 Nov 2024
Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Xiaozhong Ji
Xiaobin Hu
Zhihong Xu
Junwei Zhu
Chuming Lin
...
Donghao Luo
Yi Chen
Qin Lin
Qinglin Lu
Chengjie Wang
VGen
148
11
0
25 Nov 2024
Generative Omnimatte: Learning to Decompose Video into Layers
Generative Omnimatte: Learning to Decompose Video into Layers
Yao-Chih Lee
Erika Lu
Sarah Rumbley
Michal Geyer
Jia-Bin Huang
Tali Dekel
Forrester Cole
DiffMVGen
203
8
0
25 Nov 2024
MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model
MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model
Chenjie Cao
Chaohui Yu
Shang Liu
Fan Wang
Xiangyang Xue
Yanwei Fu
148
2
0
25 Nov 2024
Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors
Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors
Soumava Paul
Prakhar Kaushik
Alan Yuille
3DGSDiffM
544
0
0
24 Nov 2024
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu
Mingyu Liu
Zeyu Zhu
Xi Xia
Haoen Feng
Wen Wang
Kevin Qinghong Lin
Chunhua Shen
Mike Zheng Shou
DiffMVGen
230
3
0
22 Nov 2024
PhysFlow: Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation
PhysFlow: Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation
Zhuoman Liu
Weicai Ye
Yan Luximon
Pengfei Wan
Di Zhang
VGenAI4CE
181
6
0
21 Nov 2024
SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input
SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input
Zhen Lv
Yangqi Long
Congzhentao Huang
Cao Li
Chengfei Lv
Hao Ren
Dian Zheng
DiffMVGenMDE
189
6
0
18 Nov 2024
I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength
I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength
Wanquan Feng
Jiawei Liu
Pengqi Tu
Tianhao Qi
Mingzhen Sun
Tianxiang Ma
Mingcong Liu
Siyu Zhou
Qian He
VGen
173
10
0
10 Nov 2024
DimensionX: Create Any 3D and 4D Scenes from a Single Image with
  Controllable Video Diffusion
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
Wenqiang Sun
Shuo Chen
Fan Liu
Zilong Chen
Yueqi Duan
Jun Zhang
Yikai Wang
VGen
121
41
0
07 Nov 2024
StoryAgent: Customized Storytelling Video Generation via Multi-Agent
  Collaboration
StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration
Panwen Hu
Jin Jiang
Jianqi Chen
Mingfei Han
Shengcai Liao
Xiaojun Chang
Xiaodan Liang
VGenDiffM
128
6
0
07 Nov 2024
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
Koichi Namekata
Sherwin Bahmani
Ziyi Wu
Yash Kant
Igor Gilitschenski
David B. Lindell
VGen
171
16
0
07 Nov 2024
On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection
On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection
Xiufeng Song
Xiao Guo
Junxuan Zhang
Qirui Li
Lei Bai
Xiaoming Liu
Guangtao Zhai
Xiaohong Liu
VGenDiffM
177
12
0
31 Oct 2024
Video prediction using score-based conditional density estimation
Video prediction using score-based conditional density estimation
P. Fiquet
Eero P. Simoncelli
AI4TS
48
0
0
30 Oct 2024
One Prompt to Verify Your Models: Black-Box Text-to-Image Models Verification via Non-Transferable Adversarial Attacks
One Prompt to Verify Your Models: Black-Box Text-to-Image Models Verification via Non-Transferable Adversarial Attacks
Ji Guo
Wenbo Jiang
Rui Zhang
Guoming Lu
Hongwei Li
AAML
160
0
0
30 Oct 2024
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Zongyi Li
Shujie Hu
Shujie Liu
Long Zhou
Jeongsoo Choi
Lingwei Meng
Xun Guo
Jiajian Li
H. Ling
Furu Wei
VGenDiffM
150
7
0
27 Oct 2024
Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances
Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances
Shilin Lu
Zihan Zhou
Jiayou Lu
Yuanzhi Zhu
A. Kong
WIGM
143
15
0
24 Oct 2024
FreeVS: Generative View Synthesis on Free Driving Trajectory
FreeVS: Generative View Synthesis on Free Driving Trajectory
Qitai Wang
Lue Fan
Yuqi Wang
Yuntao Chen
Zhaoxiang Zhang
VGen
114
8
0
23 Oct 2024
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
Jiayi Liu
Denys Iliash
Angel X. Chang
Manolis Savva
Ali Mahdavi-Amiri
163
13
0
21 Oct 2024
EVA: An Embodied World Model for Future Video Anticipation
EVA: An Embodied World Model for Future Video Anticipation
Xiaowei Chi
Hengyuan Zhang
Chun-Kai Fan
Xingqun Qi
Rongyu Zhang
...
Chi-Min Chan
Wei Xue
Wenhan Luo
Shanghang Zhang
Yike Guo
VGen
91
8
0
20 Oct 2024
FrameBridge: Improving Image-to-Video Generation with Bridge Models
FrameBridge: Improving Image-to-Video Generation with Bridge Models
Yuji Wang
Zehua Chen
Xiaoyu Chen
Jun-Jie Zhu
Jianfei Chen
Jianfei Chen
DiffMVGen
513
5
0
20 Oct 2024
Assessing Open-world Forgetting in Generative Image Model Customization
Assessing Open-world Forgetting in Generative Image Model Customization
Héctor Laria
Alex Gomez-Villa
Imad Eddine Marouf
Bogdan Raducanu
Bogdan Raducanu
VLMDiffM
112
0
0
18 Oct 2024
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving
  Scene Representation
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation
Guosheng Zhao
Chaojun Ni
Xiaofeng Wang
Zheng Zhu
Xinming Zhang
...
Xinze Chen
Boyuan Wang
Youyi Zhang
Wenjun Mei
Xingang Wang
VGen
174
32
0
17 Oct 2024
DepthSplat: Connecting Gaussian Splatting and Depth
DepthSplat: Connecting Gaussian Splatting and Depth
Haofei Xu
Songyou Peng
Fangjinhua Wang
Hermann Blum
Dániel Baráth
Andreas Geiger
Marc Pollefeys
3DGSMDE
119
39
0
17 Oct 2024
Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing
Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing
Mingce Guo
Jingxuan He
Shengeng Tang
Zhangye Wang
Lechao Cheng
VGenDiffM
136
0
0
16 Oct 2024
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Ziyue Li
Dinesh Manocha
MoE
157
19
0
14 Oct 2024
Distillation of Discrete Diffusion through Dimensional Correlations
Distillation of Discrete Diffusion through Dimensional Correlations
Satoshi Hayakawa
Yuhta Takida
Masaaki Imaizumi
Hiromi Wakaki
Yuki Mitsufuji
DiffM
170
4
0
11 Oct 2024
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image
  Animation
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation
Jiahao Cui
Hui Li
Yao Yao
Hao Zhu
Hanlin Shang
Kaihui Cheng
Hang Zhou
Siyu Zhu
Jingdong Wang
DiffMVGen
101
29
0
10 Oct 2024
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
Qingwen Bu
Hongyang Li
Li Chen
Jisong Cai
Jia Zeng
Heming Cui
Maoqing Yao
Yu Qiao
150
11
0
10 Oct 2024
Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content
Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content
Qiuheng Wang
Yukai Shi
Jiarong Ou
Ruoxin Chen
Ke Lin
...
Mingwu Zheng
Xin Tao
Fei Yang
Pengfei Wan
Di Zhang
VGen
155
34
0
10 Oct 2024
Progressive Autoregressive Video Diffusion Models
Progressive Autoregressive Video Diffusion Models
Desai Xie
Zhan Xu
Yicong Hong
Hao Tan
Difan Liu
Feng Liu
Arie E. Kaufman
Yang Zhou
DiffMVGen
127
15
0
10 Oct 2024
AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation
AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation
Yukang Cao
Liang Pan
Kai Han
Kwan-Yee K. Wong
Ziwei Liu
VGen
129
6
0
09 Oct 2024
ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler
ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler
Serin Yang
Taesung Kwon
Jong Chul Ye
VGenDiffM
111
7
0
08 Oct 2024
Pyramidal Flow Matching for Efficient Video Generative Modeling
Pyramidal Flow Matching for Efficient Video Generative Modeling
Yang Jin
Zhicheng Sun
Ningyuan Li
Kun Xu
K. Xu
...
Nan Zhuang
Quzhe Huang
Yang Song
Yadong Mu
Zhouchen Lin
VGen
168
87
0
08 Oct 2024
Previous
1234567
Next