ResearchTrend.AI
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
    VGen
ArXiv (abs) | PDF | HTML | GitHub (25943★)

Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"

50 / 332 papers shown
Video Signature: In-generation Watermarking for Latent Video Diffusion Models
Yu Huang
Junhao Chen
Qi Zheng
Hanqian Li
Shuliang Liu
Xuming Hu
DiffM, WIGM, VGen
51
0
0
31 May 2025
MUSE: Model-Agnostic Tabular Watermarking via Multi-Sample Selection
Liancheng Fang
Aiwei Liu
Henry Peng Zou
Yankai Chen
Hengrui Zhang
Zhongfen Deng
Philip S. Yu
38
0
0
30 May 2025
UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation
Yang-tian Sun
Xin Yu
Zehuan Huang
Yi-Hua Huang
Yuan-Chen Guo
Ziyi Yang
Yan-Pei Cao
Xiaojuan Qi
DiffM, VGen, MDE
46
1
0
30 May 2025
DreamDance: Animating Character Art via Inpainting Stable Gaussian Worlds
Jiaxu Zhang
Xianfang Zeng
Xin Chen
W. Zuo
Gang Yu
Guosheng Lin
Zhigang Tu
DiffM, 3DGS, VGen
41
0
0
30 May 2025
GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion
Gwanghyun Kim
Xueting Li
Ye Yuan
Koki Nagano
Tianye Li
Jan Kautz
Se Young Chun
Umar Iqbal
DiffM
63
0
0
29 May 2025
Generating Fit Check Videos with a Handheld Camera
B. Chen
Brian L. Curless
Ira Kemelmacher-Shlizerman
Steven M. Seitz
DiffM
30
0
0
29 May 2025
Zero-to-Hero: Zero-Shot Initialization Empowering Reference-Based Video Appearance Editing
Tongtong Su
Chengyu Wang
Jun Huang
Dongming Lu
DiffM, VGen
33
0
0
29 May 2025
Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization and Temporal Motion Modulation
Jiahao Cui
Yan Chen
Mingwang Xu
Hanlin Shang
Yuxuan Chen
Yun Zhan
Zilong Dong
Yao Yao
Jingdong Wang
Siyu Zhu
DiffM, VGen
60
0
0
29 May 2025
PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms
Yifei Xia
Shuchen Weng
Siqi Yang
Jingqi Liu
Chengxuan Zhu
Minggui Teng
Zijian Jia
Han Jiang
Boxin Shi
DiffM, VGen
100
0
0
28 May 2025
ATI: Any Trajectory Instruction for Controllable Video Generation
Angtian Wang
Haibin Huang
Jacob Zhiyuan Fang
Yiding Yang
Chongyang Ma
DiffM, VGen
77
0
0
28 May 2025
GeoDrive: 3D Geometry-Informed Driving World Model with Precise Action Control
Anthony Chen
Wenzhao Zheng
Yida Wang
Xueyang Zhang
Kun Zhan
Peng Jia
Kurt Keutzer
Shanghang Zhang
105
1
0
28 May 2025
Frame-Level Captions for Long Video Generation with Complex Multi Scenes
Guangcong Zheng
Jianlong Yuan
Bo Wang
Haoyang Huang
Guoqing Ma
Nan Duan
DiffM, VGen
81
0
0
27 May 2025
Advancing high-fidelity 3D and Texture Generation with 2.5D latents
Xin Yang
Jiantao Lin
Yingjie Xu
Haodong Li
Yingcong Chen
3DV
62
0
0
27 May 2025
Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
Boyang Wang
Xuweiyi Chen
Matheus Gadelha
Zezhou Cheng
DiffM, VGen
74
0
0
27 May 2025
Sci-Fi: Symmetric Constraint for Frame Inbetweening
Liuhan Chen
Xiaodong Cun
Xiaoyu Li
Xianyi He
Shenghai Yuan
Jie Chen
Ying Shan
Li Yuan
VGen
81
0
0
27 May 2025
Dynamic-I2V: Exploring Image-to-Video Generation Models via Multimodal LLM
Peng Liu
Xiaoming Ren
Fengkai Liu
Qingsong Xie
Quanlong Zheng
Yanhao Zhang
Haonan Lu
Yujiu Yang
EGVM, VGen
74
0
0
26 May 2025
HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation for Multiple Characters
Yi Chen
Sen Liang
Zixiang Zhou
Ziyao Huang
Yifeng Ma
Junshu Tang
Qin Lin
Yuan Zhou
Qinglin Lu
VGen
47
0
0
26 May 2025
Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals
Nate Gillman
Charles Herrmann
Michael Freeman
Daksh Aggarwal
Evan Luo
Deqing Sun
Chen Sun
VGen, AI4CE
93
0
0
26 May 2025
Adaptive Diffusion Guidance via Stochastic Optimal Control
Iskander Azangulov
Peter Potaptchik
Qinyu Li
Eddie Aamari
George Deligiannidis
Judith Rousseau
25
0
0
25 May 2025
Smoothie: Smoothing Diffusion on Token Embeddings for Text Generation
Alexander Shabalin
Viacheslav Meshchaninov
Dmitry Vetrov
44
0
0
24 May 2025
ProphetDWM: A Driving World Model for Rolling Out Future Actions and Videos
Xiaodong Wang
Peixi Peng
VGen
1.1K
1
0
24 May 2025
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data
Yiren Song
Cheng Liu
Mike Zheng Shou
DiffM
178
2
0
24 May 2025
Diffusion Classifiers Understand Compositionality, but Conditions Apply
Yujin Jeong
Arnas Uselis
Seong Joon Oh
Anna Rohrbach
DiffM, CoGe
568
0
3
23 May 2025
FLEX: A Backbone for Diffusion-Based Modeling of Spatio-temporal Physical Systems
N. Benjamin Erichson
Vinicius Mikuni
Dongwei Lyu
Yang Gao
Omri Azencot
Soon Hoe Lim
Michael W. Mahoney
AI4CE
898
0
0
23 May 2025
WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions
Zizhang Li
Hong-Xing Yu
Wei Liu
Yin Yang
Charles Herrmann
Gordon Wetzstein
Jiajun Wu
VGen
79
0
0
23 May 2025
SafeMVDrive: Multi-view Safety-Critical Driving Video Synthesis in the Real World Domain
Jiawei Zhou
Linye Lyu
Zhuotao Tian
Cheng Zhuo
Yu Li
VGen
74
0
0
23 May 2025
Simultaneous Modeling of Protein Conformation and Dynamics via Autoregression
Yuning Shen
Lihao Wang
Huizhuo Yuan
Yan Wang
B. Yang
Quanquan Gu
DiffM, AI4CE
158
0
0
23 May 2025
Learning to Integrate Diffusion ODEs by Averaging the Derivatives
Wenze Liu
Xiangyu Yue
80
0
0
20 May 2025
Programmatic Video Prediction Using Large Language Models
Hao Tang
Kevin Ellis
Suhas Lohit
Michael J. Jones
Moitreya Chatterjee
VGen
104
0
0
20 May 2025
Video-GPT via Next Clip Diffusion
Shaobin Zhuang
Zhipeng Huang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Binxin Yang
Chong Sun
Chen Li
Yali Wang
DiffM, VGen
241
0
0
18 May 2025
SpikeVideoFormer: An Efficient Spike-Driven Video Transformer with Hamming Attention and $\mathcal{O}(T)$ Complexity
Shihao Zou
Qingfeng Li
Wei Ji
Jingjing Li
Yongkui Yang
Guoqi Li
Chao Dong
184
0
0
15 May 2025
EWMBench: Evaluating Scene, Motion, and Semantic Quality in Embodied World Models
Hu Yue
Siyuan Huang
Yue Liao
Shengcong Chen
Pengfei Zhou
Liliang Chen
Maoqing Yao
Guanghui Ren
VGen
82
1
0
14 May 2025
Generative Pre-trained Autoregressive Diffusion Transformer
Yuan Zhang
Jiacheng Jiang
Guoqing Ma
Zhiying Lu
Haoyang Huang
Jianlong Yuan
Nan Duan
VGen
136
2
0
12 May 2025
BridgeIV: Bridging Customized Image and Video Generation through Test-Time Autoregressive Identity Propagation
Panwen Hu
Jiehui Huang
Qiang Sun
Xiaodan Liang
DiffM, VGen
102
0
0
11 May 2025
Demystifying Diffusion Policies: Action Memorization and Simple Lookup Table Alternatives
Chengyang He
Xu Liu
Gadiel Sznaier Camps
Guillaume Sartoretti
Mac Schwager
73
1
0
09 May 2025
Generating Animated Layouts as Structured Text Representations
Yeonsang Shin
Jihwan Kim
Yumin Song
Kyungseung Lee
Hyunhee Chung
Taeyoung Na
DiffM, VGen
108
0
0
02 May 2025
VIDSTAMP: A Temporally-Aware Watermark for Ownership and Integrity in Video Diffusion Models
Mohammadreza Teymoorianfard
Shiqing Ma
Amir Houmansadr
WIGM
130
0
0
02 May 2025
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
Qihao Liu
Ju He
Qihang Yu
Liang-Chieh Chen
Alan Yuille
DiffM, VGen
158
1
0
30 Apr 2025
VideoMark: A Distortion-Free Robust Watermarking Framework for Video Diffusion Models
Xuming Hu
Haoyang Li
Jiajun Li
Yu Huang
Aiwei Liu
WIGM, VGen
141
3
0
23 Apr 2025
FlowLoss: Dynamic Flow-Conditioned Loss Strategy for Video Diffusion Models
Kuanting Wu
Kei Ota
Asako Kanezaki
DiffM, VGen
118
0
0
20 Apr 2025
U-Shape Mamba: State Space Model for faster diffusion
Alex Ergasti
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
Mamba
182
1
0
18 Apr 2025
SOPHY: Learning to Generate Simulation-Ready Objects with Physical Materials
Junyi Cao
Evangelos Kalogerakis
AI4CE
67
0
0
17 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
180
1
0
16 Apr 2025
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting
Jiaxin Huang
Sheng Miao
Bangbang Yang
Yuewen Ma
Yiyi Liao
VGen, MDE
157
0
0
15 Apr 2025
VideoPanda: Video Panoramic Diffusion with Multi-view Attention
Kevin Xie
Amirmojtaba Sabour
Jiahui Huang
Despoina Paschalidou
G. Klár
Umar Iqbal
Sanja Fidler
Fangyin Wei
VGen, MDE
119
1
0
15 Apr 2025
PT-Mark: Invisible Watermarking for Text-to-image Diffusion Models via Semantic-aware Pivotal Tuning
Yansen Wang
Huiyu Xu
Peng Kuang
Jiacheng Du
Zehan Li
Yiming Li
Qiu Wang
Kui Ren
WIGM
164
0
0
15 Apr 2025
GaussVideoDreamer: 3D Scene Generation with Video Diffusion and Inconsistency-Aware Gaussian Splatting
Junlin Hao
Peiheng Wang
Haoyang Wang
Xinggong Zhang
3DGS, VGen
158
0
0
14 Apr 2025
EquiVDM: Equivariant Video Diffusion Models with Temporally Consistent Noise
Chao Liu
Arash Vahdat
DiffM, VGen
97
2
0
14 Apr 2025
SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models
Stathis Galanakis
Alexandros Lattas
Stylianos Moschoglou
Bernhard Kainz
Stefanos Zafeiriou
DiffM
97
0
0
14 Apr 2025
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
Team Seaweed
Ceyuan Yang
Zhijie Lin
Yang Zhao
Shanchuan Lin
...
Zuquan Song
Zhenheng Yang
Jiashi Feng
Jianchao Yang
Lu Jiang
DiffM
184
22
0
11 Apr 2025