ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2311.15127
  4. Cited By
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large
  Datasets

Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
    VGen
ArXiv (abs)PDFHTMLGithub (25943★)

Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"

50 / 332 papers shown
Title
Learning Efficient and Effective Trajectories for Differential Equation-based Image Restoration
Learning Efficient and Effective Trajectories for Differential Equation-based Image Restoration
Zhiyu Zhu
Jinhui Hou
Hui Liu
H. Zeng
Junhui Hou
81
0
0
07 Oct 2024
Elucidating the Design Choice of Probability Paths in Flow Matching for Forecasting
Elucidating the Design Choice of Probability Paths in Flow Matching for Forecasting
Soon Hoe Lim
Yijin Wang
Annan Yu
Emma Hart
Michael W. Mahoney
Xiaoye S. Li
N. Benjamin Erichson
AI4TS
109
2
0
04 Oct 2024
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models
Seyedmorteza Sadat
Otmar Hilliges
Romann M. Weber
DiffM
58
13
0
03 Oct 2024
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
161
35
0
03 Oct 2024
IoT-LLM: Enhancing Real-World IoT Task Reasoning with Large Language Models
IoT-LLM: Enhancing Real-World IoT Task Reasoning with Large Language Models
Tuo An
Yunjiao Zhou
Han Zou
Jianfei Yang
LRM
90
9
0
03 Oct 2024
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences
Genta Indra Winata
David Anugraha
Lucky Susanto
Garry Kuwanto
Derry Wijaya
169
11
0
03 Oct 2024
Text2PDE: Latent Diffusion Models for Accessible Physics Simulation
Text2PDE: Latent Diffusion Models for Accessible Physics Simulation
Anthony Zhou
Zijie Li
Michael Schneier
John R Buchanan Jr
Amir Barati Farimani
AI4CEDiffM
170
8
0
02 Oct 2024
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Jie Cheng
Ruixi Qiao
Gang Xiong
Binhua Li
Yingwei Ma
Binhua Li
Yongbin Li
Yisheng Lv
OffRLOnRLLM&Ro
133
4
0
01 Oct 2024
MIO: A Foundation Model on Multimodal Tokens
MIO: A Foundation Model on Multimodal Tokens
Zekun Wang
King Zhu
Chunpu Xu
Wangchunshu Zhou
Jiaheng Liu
...
Yuanxing Zhang
Ge Zhang
Ke Xu
Jie Fu
Wenhao Huang
MLLMAuLLM
173
12
0
26 Sep 2024
Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models
Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models
A. Popov
Alperen Degirmenci
David Wehr
Shashank Hegde
Ryan Oldja
...
David Nistér
Urs Muller
Ruchi Bhargava
Stan Birchfield
Nikolai Smolyanskiy
153
11
0
25 Sep 2024
Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model
Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model
Hongliang Zhong
Can Wang
Jingbo Zhang
Jing Liao
3DGSDiffM
84
2
0
25 Sep 2024
Dormant: Defending against Pose-driven Human Image Animation
Dormant: Defending against Pose-driven Human Image Animation
Jiachen Zhou
Mingsi Wang
Tianlin Li
Guozhu Meng
Kai Chen
160
5
0
22 Sep 2024
OSV: One Step is Enough for High-Quality Image to Video Generation
OSV: One Step is Enough for High-Quality Image to Video Generation
Xiaofeng Mao
Zhengkai Jiang
Fu-Yun Wang
Wenbing Zhu
Hao Chen
Mingmin Chi
Yabiao Wang
Wenhan Luo
DiffMVGen
129
13
0
17 Sep 2024
DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes
DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes
Jianbiao Mei
T. Hu
Xuemeng Yang
Licheng Wen
Yu Yang
Tiantian Wei
Yukai Ma
Min Dou
Botian Shi
Yong Liu
VGenDiffM
170
6
0
06 Sep 2024
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-Xiong Wang
140
23
0
05 Sep 2024
CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention
CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention
Gaojie Lin
Jianwen Jiang
Chao Liang
Tianyun Zhong
Jiaqi Yang
Yanbo Zheng
VGenDiffM
140
19
0
03 Sep 2024
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model
Fan Liu
Wenqiang Sun
Hanyang Wang
Yikai Wang
Haowen Sun
Junliang Ye
Jun Zhang
Yueqi Duan
VGen
114
41
0
29 Aug 2024
Diffusion Models Are Real-Time Game Engines
Diffusion Models Are Real-Time Game Engines
Dani Valevski
Yaniv Leviathan
Moab Arar
Shlomi Fruchter
DiffMVGenAI4CE
139
91
0
27 Aug 2024
Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation
Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation
Xiaojuan Wang
Boyang Zhou
Brian L. Curless
Ira Kemelmacher-Shlizerman
Aleksander Holynski
Steven M. Seitz
DiffM
122
17
0
27 Aug 2024
Atlas Gaussians Diffusion for 3D Generation
Atlas Gaussians Diffusion for 3D Generation
Haitao Yang
Yuan Dong
Hanwen Jiang
Dejia Xu
Georgios Pavlakos
Qixing Huang
3DGS
189
3
0
23 Aug 2024
Real-Time Video Generation with Pyramid Attention Broadcast
Real-Time Video Generation with Pyramid Attention Broadcast
Xuanlei Zhao
Xiaolong Jin
Kai Wang
Yang You
VGenDiffM
180
45
0
22 Aug 2024
TrackGo: A Flexible and Efficient Method for Controllable Video Generation
TrackGo: A Flexible and Efficient Method for Controllable Video Generation
Haitao Zhou
Chuang Wang
Rui Nie
Jinxiao Lin
Dongdong Yu
Qian Yu
Changhu Wang
VGenDiffM
162
15
0
21 Aug 2024
Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation
Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation
Liu He
Yizhi Song
Hejun Huang
Pinxin Liu
Yunlong Tang
Daniel G. Aliaga
Xin Zhou
DiffMVGen
146
6
0
19 Aug 2024
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
...
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
DiffMVGen
308
565
0
12 Aug 2024
Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion
Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion
Manuel Kansy
Jacek Naruniec
Christopher Schroers
Markus Gross
Romann M. Weber
DiffMVGen
127
4
0
01 Aug 2024
Tora: Trajectory-oriented Diffusion Transformer for Video Generation
Tora: Trajectory-oriented Diffusion Transformer for Video Generation
Zhenghao Zhang
Junchao Liao
Menghao Li
Zuozhuo Dai
Bingxue Qiu
Hao Hu
Shaowei Cai
Weizhi Wang
VGen
171
57
0
31 Jul 2024
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency
Yiming Xie
Chun-Han Yao
Vikram S. Voleti
Huaizu Jiang
Varun Jampani
VGen
149
47
0
24 Jul 2024
DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion
DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion
Huiguo He
Huan Yang
Zixi Tuo
Yuan Zhou
Qiuyue Wang
Yuhang Zhang
Zeyu Liu
Wenhao Huang
Hongyang Chao
Jian Yin
DiffMVGen
200
17
0
17 Jul 2024
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
Sherwin Bahmani
Ivan Skorokhodov
Aliaksandr Siarohin
Willi Menapace
Guocheng Qian
...
Chaoyang Wang
Jiaxu Zou
Andrea Tagliasacchi
David B. Lindell
Sergey Tulyakov
VGenDiffM
205
50
0
17 Jul 2024
Kinetic Typography Diffusion Model
Kinetic Typography Diffusion Model
Seonmi Park
Inhwan Bae
Seunghyun Shin
Hae-Gon Jeon
DiffM
111
2
0
15 Jul 2024
T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models
T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models
Yibo Miao
Yifan Zhu
Yinpeng Dong
Lijia Yu
Jun Zhu
Xiao-Shan Gao
EGVM
127
20
0
08 Jul 2024
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
Kepan Nan
Rui Xie
Penghao Zhou
Tiehan Fan
Zhenheng Yang
Zhijie Chen
Xiang Li
Jian Yang
Ying Tai
152
93
0
02 Jul 2024
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
Seyedmorteza Sadat
Manuel Kansy
Otmar Hilliges
Romann M. Weber
91
14
0
02 Jul 2024
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Yuang Zhang
Jiaxi Gu
L. Wang
Han Wang
Junqi Cheng
Yuefeng Zhu
Fangyuan Zou
VGen
161
85
0
28 Jun 2024
Text-Animator: Controllable Visual Text Video Generation
Text-Animator: Controllable Visual Text Video Generation
Lin Liu
Quande Liu
Shengju Qian
Yuan Zhou
Wengang Zhou
Houqiang Li
Lingxi Xie
Qi Tian
VGen
96
1
0
25 Jun 2024
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
Yuang Peng
Yuxin Cui
Haomiao Tang
Zekun Qi
Runpei Dong
Jing Bai
Chunrui Han
Zheng Ge
Xiangyu Zhang
Shu-Tao Xia
EGVM
180
39
0
24 Jun 2024
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Luxi He
Yangsibo Huang
Weijia Shi
Tinghao Xie
Haotian Liu
Yue Wang
Luke Zettlemoyer
Chiyuan Zhang
Danqi Chen
Peter Henderson
113
12
0
20 Jun 2024
Training-free Camera Control for Video Generation
Training-free Camera Control for Video Generation
Chen Hou
Guoqiang Wei
VGenDiffM
199
40
0
14 Jun 2024
LRM-Zero: Training Large Reconstruction Models with Synthesized Data
LRM-Zero: Training Large Reconstruction Models with Synthesized Data
Desai Xie
Sai Bi
Zhixin Shu
Kai Zhang
Zexiang Xu
Yi Zhou
Soren Pirk
Arie E. Kaufman
Xin Sun
Hao Tan
SyDa
107
17
0
13 Jun 2024
WonderWorld: Interactive 3D Scene Generation from a Single Image
WonderWorld: Interactive 3D Scene Generation from a Single Image
Hong-Xing Yu
Haoyi Duan
Charles Herrmann
William T. Freeman
Jiajun Wu
3DGSVGen
213
46
0
13 Jun 2024
SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
Rishit Dagli
Shivesh Prakash
Robert Wu
H. Khosravani
141
6
0
06 Jun 2024
Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image
Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image
Stanislaw Szymanowicz
Eldar Insafutdinov
Chuanxia Zheng
Dylan Campbell
João F. Henriques
Christian Rupprecht
Andrea Vedaldi
3DGS
118
56
0
06 Jun 2024
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
Hao Wen
Zehuan Huang
Yaohui Wang
Xinyuan Chen
Yu Qiao
159
9
0
05 Jun 2024
Turning Text and Imagery into Captivating Visual Video
Turning Text and Imagery into Captivating Visual Video
Mingming Wang
Elijah Miller
VGen
64
0
0
03 Jun 2024
Learning Temporally Consistent Video Depth from Video Diffusion Priors
Learning Temporally Consistent Video Depth from Video Diffusion Priors
Jiahao Shao
Yuanbo Yang
Hongyu Zhou
Youmin Zhang
Yujun Shen
Vitor Campagnolo Guizilini
Yue Wang
Matteo Poggi
Yiyi Liao
VGenDiffMMDE
123
43
0
03 Jun 2024
EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical
  Data Sharing
EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing
Hadrien Reynaud
Qingjie Meng
Mischa Dombrowski
Arijit Ghosh
Thomas Day
Alberto Gomez
Paul Leeson
Bernhard Kainz
MedIm
78
7
0
02 Jun 2024
Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion
Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion
Jiangkai Wu
Liming Liu
Yunpeng Tan
Junlin Hao
Xinggong Zhang
143
3
0
30 May 2024
Vista: A Generalizable Driving World Model with High Fidelity and
  Versatile Controllability
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao
Jiazhi Yang
Li Chen
Kashyap Chitta
Yihang Qiu
Andreas Geiger
Jun Zhang
Hongyang Li
167
103
0
27 May 2024
A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
Kai Wang
Yukun Zhou
Mingjia Shi
Zhihang Yuan
Yuzhang Shang
Yuzhang Shang
Hanwang Zhang
Hanwang Zhang
Yang You
153
14
0
27 May 2024
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion
  Models
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models
Wenqi Ouyang
Yi Dong
Lei Yang
Jianlou Si
Xingang Pan
VGenDiffM
98
16
0
26 May 2024
Previous
1234567
Next