Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2311.15127
Cited By
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
VGen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"
50 / 214 papers shown
Title
Generative Pre-trained Autoregressive Diffusion Transformer
Yuan Zhang
Jiacheng Jiang
Guoqing Ma
Zhiying Lu
Haoyang Huang
Jianlong Yuan
Nan Duan
VGen
32
1
0
12 May 2025
BridgeIV: Bridging Customized Image and Video Generation through Test-Time Autoregressive Identity Propagation
Panwen Hu
Jiehui Huang
Qiang Sun
Xiaodan Liang
DiffM
VGen
28
0
0
11 May 2025
DAPE: Dual-Stage Parameter-Efficient Fine-Tuning for Consistent Video Editing with Diffusion Models
Junhao Xia
Chaoyang Zhang
Yecheng Zhang
Chengyang Zhou
Zhichang Wang
Bochun Liu
Dongshuo Yin
DiffM
VGen
31
0
0
11 May 2025
Jailbreaking the Text-to-Video Generative Models
Jiayang Liu
Siyuan Liang
Shiqian Zhao
Rongcheng Tu
Wenbo Zhou
Xiaochun Cao
D. Tao
Siew Kei Lam
EGVM
VGen
46
0
0
10 May 2025
ProFashion: Prototype-guided Fashion Video Generation with Multiple Reference Images
Xianghao Kong
Qiaosong Qi
Yuanbin Wang
Anyi Rao
Biaolong Chen
Aixi Zhang
Si Liu
Hao Jiang
DiffM
VGen
25
0
0
10 May 2025
Demystifying Diffusion Policies: Action Memorization and Simple Lookup Table Alternatives
Chengyang He
Xu Liu
Gadiel Sznaier Camps
Guillaume Sartoretti
Mac Schwager
28
0
0
09 May 2025
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Jae-Won Chung
Jiachen Liu
Jeff J. Ma
Ruofan Wu
Oh Jun Kweon
Yuxuan Xia
Zhiyu Wu
Mosharaf Chowdhury
26
0
0
09 May 2025
T2VTextBench: A Human Evaluation Benchmark for Textual Control in Video Generation Models
Xuyang Guo
Jiayan Huo
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
Jiale Zhao
VGen
142
0
0
08 May 2025
ReverbMiipher: Generative Speech Restoration meets Reverberation Characteristics Controllability
Wataru Nakata
Yuma Koizumi
Shigeki Karita
Robin Scheibler
Haruko Ishikawa
Adriana Guevara-Rukoz
Heiga Zen
M. Bacchiani
48
0
0
08 May 2025
DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization
Wenchuan Wang
Mengqi Huang
Yijing Tu
Zhendong Mao
VGen
69
0
0
04 May 2025
Generating Animated Layouts as Structured Text Representations
Yeonsang Shin
Jihwan Kim
Yumin Song
Kyungseung Lee
Hyunhee Chung
Taeyoung Na
DiffM
VGen
68
0
0
02 May 2025
VIDSTAMP: A Temporally-Aware Watermark for Ownership and Integrity in Video Diffusion Models
Mohammadreza Teymoorianfard
Shiqing Ma
Amir Houmansadr
WIGM
58
0
0
02 May 2025
FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis
Jiangtong Tan
Hu Yu
Jie Huang
Jie Xiao
Feng Zhao
69
1
0
02 May 2025
KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution
Antoni Bigata
Rodrigo Mira
Stella Bounareli
Michał Stypułkowski
Konstantinos Vougioukas
Stavros Petridis
Maja Pantic
52
0
0
01 May 2025
T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation
Xuyang Guo
Jiayan Huo
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
Jiale Zhao
EGVM
VGen
PINN
82
1
0
01 May 2025
A Survey of Interactive Generative Video
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
X. Wang
Pengfei Wan
Di Zhang
Kun Gai
Hao Chen
Xihui Liu
VGen
60
0
0
30 Apr 2025
Eye2Eye: A Simple Approach for Monocular-to-Stereo Video Synthesis
Michal Geyer
Omer Tov
Linyi Jin
Richard Tucker
Inbar Mosseri
Tali Dekel
Noah Snavely
DiffM
VGen
100
0
0
30 Apr 2025
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction
Qihao Liu
Ju He
Qihang Yu
Liang-Chieh Chen
Alan Yuille
DiffM
VGen
78
0
0
30 Apr 2025
ADiff4TPP: Asynchronous Diffusion Models for Temporal Point Processes
Amartya Mukherjee
Ruizhi Deng
He Zhao
Yuzhen Mao
Leonid Sigal
Frederick Tung
DiffM
AI4TS
53
0
0
29 Apr 2025
SynergyAmodal: Deocclude Anything with Text Control
Xinyang Li
Chengjie Yi
Jiawei Lai
Mingbao Lin
Yansong Qu
Shengchuan Zhang
Liujuan Cao
DiffM
73
0
0
28 Apr 2025
DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer
Junpeng Jiang
Gangyi Hong
Miao Zhang
Hengtong Hu
Kun Zhan
Rui Shao
Liqiang Nie
VGen
51
0
0
28 Apr 2025
Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation
Weipeng Tan
Chuming Lin
Chengming Xu
F. Xu
Xiaobin Hu
Xiaozhong Ji
Junwei Zhu
Chengjie Wang
Yanwei Fu
44
0
0
25 Apr 2025
Subject-driven Video Generation via Disentangled Identity and Motion
Daneul Kim
Jingxu Zhang
W. Jin
Sunghyun Cho
Qi Dai
Jaesik Park
Chong Luo
DiffM
VGen
110
0
0
23 Apr 2025
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
X. Li
Chenming Wu
Zhao Yang
Zhihao Xu
Dingkang Liang
Y. Zhang
Ji Wan
J. Wang
VGen
67
1
0
22 Apr 2025
T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models
Siyuan Liang
Jiayang Liu
Jiecheng Zhai
Tianmeng Fang
Rongcheng Tu
A. Liu
Xiaochun Cao
Dacheng Tao
VGen
58
0
0
22 Apr 2025
U-Shape Mamba: State Space Model for faster diffusion
Alex Ergasti
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
Mamba
89
0
0
18 Apr 2025
SkyReels-V2: Infinite-length Film Generative Model
Guibin Chen
D. Lin
Jiangping Yang
Chunze Lin
J. Zhu
...
Di Qiu
Debang Li
Zhengcong Fei
Yang Li
Yahui Zhou
DiffM
VGen
51
1
0
17 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
74
0
0
16 Apr 2025
PT-Mark: Invisible Watermarking for Text-to-image Diffusion Models via Semantic-aware Pivotal Tuning
Y. Wang
Huiyu Xu
Zhibo Wang
Jiacheng Du
Z. Li
Yiming Li
Qiu Wang
Kui Ren
WIGM
52
0
0
15 Apr 2025
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting
Jiaxin Huang
Sheng Miao
BangBnag Yang
Yuewen Ma
Yiyi Liao
VGen
MDE
33
0
0
15 Apr 2025
VideoPanda: Video Panoramic Diffusion with Multi-view Attention
Kevin Xie
Amirmojtaba Sabour
Jiahui Huang
Despoina Paschalidou
G. Klár
Umar Iqbal
Sanja Fidler
Xiaohui Zeng
VGen
MDE
36
0
0
15 Apr 2025
Decoupled Diffusion Sparks Adaptive Scene Generation
Yunsong Zhou
Naisheng Ye
William Ljungbergh
Tianyu Li
Jiazhi Yang
Zetong Yang
Hongzi Zhu
Christoffer Petersson
Hongyang Li
38
1
0
14 Apr 2025
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
Team Seawead
Ceyuan Yang
Zhijie Lin
Yang Zhao
Shanchuan Lin
...
Zuquan Song
Zhenheng Yang
Jiashi Feng
Jianchao Yang
Lu Jiang
DiffM
81
1
0
11 Apr 2025
Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos
Rundong Luo
Matthew Wallingford
Ali Farhadi
Noah Snavely
Wei-Chiu Ma
VGen
28
0
0
10 Apr 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
69
0
0
09 Apr 2025
Gaussian Mixture Flow Matching Models
Hansheng Chen
Kai Zhang
Hao Tan
Zexiang Xu
Fujun Luan
Leonidas J. Guibas
Gordon Wetzstein
Sai Bi
DiffM
63
0
0
07 Apr 2025
Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets
Chuning Zhu
Raymond Yu
S. Feng
Benjamin Burchfiel
Paarth Shah
Abhishek Gupta
VGen
55
0
0
03 Apr 2025
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
Fa-Ting Hong
Zunnan Xu
Zixiang Zhou
Jun Zhou
Xiu Li
Qin Lin
Qinglin Lu
D. Xu
DiffM
VGen
57
2
0
03 Apr 2025
MG-Gen: Single Image to Motion Graphics Generation with Layer Decomposition
Takahiro Shirakawa
Tomoyuki Suzuki
Daichi Haraguchi
VGen
39
0
0
03 Apr 2025
VideoGen-Eval: Agent-based System for Video Generation Evaluation
Yuhang Yang
Ke Fan
S.
Hongxiang Li
Ailing Zeng
FeiLin Han
Wei-dong Zhai
W. Liu
Yang Cao
Zheng-jun Zha
EGVM
VGen
73
0
0
30 Mar 2025
GenFusion: Closing the Loop between Reconstruction and Generation via Videos
Sibo Wu
Congrong Xu
Binbin Huang
Andreas Geiger
Anpei Chen
VGen
140
0
0
27 Mar 2025
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models
Prin Phunyaphibarn
Phillip Y. Lee
Jaihoon Kim
Minhyuk Sung
DiffM
84
0
0
26 Mar 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
59
3
0
24 Mar 2025
ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos
Haolin Yang
Feilong Tang
Ming Hu
Yulong Li
Junjie Guo
Yexin Liu
Zelin Peng
Junjun He
Zongyuan Ge
VGen
DiffM
96
1
0
20 Mar 2025
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang
Haoran Chen
Haoyu Zhao
Guansong Lu
Yanwei Fu
Hang Xu
Zuxuan Wu
VGen
DiffM
62
0
0
20 Mar 2025
Concat-ID: Towards Universal Identity-Preserving Video Synthesis
Yong Zhong
Zhuoyi Yang
Jiayan Teng
Xiaotao Gu
Chongxuan Li
VGen
63
0
0
18 Mar 2025
MOSAIC: Generating Consistent, Privacy-Preserving Scenes from Multiple Depth Views in Multi-Room Environments
Zhixuan Liu
H. Zhu
R. Chen
Jonathan M Francis
Soonmin Hwang
J. J. Zhang
Jean Oh
VGen
157
0
0
18 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Yu Qiao
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
76
7
0
13 Mar 2025
Motion Anything: Any to Motion Generation
Zeyu Zhang
Yiran Wang
Wei Mao
Danning Li
Rui Zhao
Biao Wu
Zirui Song
Bohan Zhuang
Ian Reid
Richard I. Hartley
DiffM
VGen
53
1
0
13 Mar 2025
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
Yufan Deng
Xun Guo
Y. Wang
Jacob Zhiyuan Fang
Angtian Wang
Shenghai Yuan
Yiding Yang
Bo Liu
Haibin Huang
Chongyang Ma
DiffM
VGen
64
0
0
13 Mar 2025
1
2
3
4
5
Next