Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

25 November 2023
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
Dominik Lorenz
Yam Levi
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
    VGen

Papers citing "Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets"

50 / 226 papers shown
Concat-ID: Towards Universal Identity-Preserving Video Synthesis
Yong Zhong
Zhuoyi Yang
Jiayan Teng
Xiaotao Gu
Chongxuan Li
VGen
63
0
0
18 Mar 2025
V2Edit: Versatile Video Diffusion Editor for Videos and 3D Scenes
Yanming Zhang
Jun-Kun Chen
Jipeng Lyu
Yu-Xiong Wang
DiffM
VGen
53
0
0
13 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Yu Qiao
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
84
8
0
13 Mar 2025
^RFLAV: Rolling Flow matching for infinite Audio Video generation
Alex Ergasti
Giuseppe Tarollo
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
VGen
45
0
0
13 Mar 2025
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
Yufan Deng
Xun Guo
Yunhong Wang
Jacob Zhiyuan Fang
Angtian Wang
Shenghai Yuan
Yiding Yang
Bo Liu
Haibin Huang
Chongyang Ma
DiffM
VGen
69
0
0
13 Mar 2025
Streaming Generation of Co-Speech Gestures via Accelerated Rolling Diffusion
Evgeniia Vu
Andrei Boiarov
Dmitry Vetrov
VGen
50
0
0
13 Mar 2025
Motion Anything: Any to Motion Generation
Zeyu Zhang
Yiran Wang
Wei Mao
Danning Li
Rui Zhao
Biao Wu
Zirui Song
Bohan Zhuang
Ian Reid
Richard I. Hartley
DiffM
VGen
55
1
0
13 Mar 2025
RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling
Itay Chachy
Guy Yariv
Sagie Benaim
150
0
0
12 Mar 2025
Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space
Jian Zhu
Zhengyu Jia
Tian Gao
Jiaxin Deng
Shidi Li
Fu Liu
Peng Jia
Xianpeng Lang
Xiaolong Sun
VGen
158
0
0
12 Mar 2025
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation
Hyeonho Jeong
Suhyeon Lee
Jong Chul Ye
VGen
163
0
0
12 Mar 2025
High-Quality 3D Head Reconstruction from Any Single Portrait Image
Jianfu Zhang
Yujie Gao
Jiahui Zhan
Wentao Wang
Yiyi Zhang
H. Zhao
Liqing Zhang
3DH
47
0
0
11 Mar 2025
RayFlow: Instance-Aware Diffusion Acceleration via Adaptive Flow Trajectories
Huiyang Shao
Xin Xia
Yanting Yang
Yuxi Ren
Xing Wang
Xuefeng Xiao
56
1
0
10 Mar 2025
Generative Video Bi-flow
Chen Liu
Tobias Ritschel
DiffM
VGen
53
0
0
09 Mar 2025
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation
Runze Zhang
Guoguang Du
Xiaochuan Li
Qi Jia
Liang Jin
...
Zhenhua Guo
Yaqian Zhao
Xiaoli Gong
Rengang Li
Baoyu Fan
VGen
73
0
0
08 Mar 2025
GSV3D: Gaussian Splatting-based Geometric Distillation with Stable Video Diffusion for Single-Image 3D Object Generation
Ye Tao
Jiawei Zhang
Yahao Shi
Dongqing Zou
Bin Zhou
3DGS
52
0
0
08 Mar 2025
FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video
Yue Gao
Hong-Xing Yu
Bo Zhu
Jiajun Wu
VGen
64
1
0
06 Mar 2025
Generative Artificial Intelligence in Robotic Manipulation: A Survey
Anton van den Hengel
Peng Yun
Jun Cen
Junhao Cai
DiDi Zhu
...
Qifeng Chen
Jia Pan
Wei Zhang
Bo Yang
Hua Chen
59
1
0
05 Mar 2025
Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization
Jamie Wynn
Z. Qureshi
Jakub Powierza
Jamie Watson
Mohamed Sayed
3DGS
DiffM
76
0
0
03 Mar 2025
KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation
Antoni Bigata
Michał Stypułkowski
Rodrigo Mira
Stella Bounareli
Konstantinos Vougioukas
Zoe Landgraf
Nikita Drobyshev
Maciej Zięba
Stavros Petridis
M. Pantic
DiffM
VGen
67
2
0
03 Mar 2025
Unified Video Action Model
Shuang Li
Yihuai Gao
Dorsa Sadigh
Shuran Song
VGen
50
2
0
28 Feb 2025
Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos
Zhiyu Tan
Junyan Wang
Hao Yang
Luozheng Qin
Hesen Chen
Qiang-feng Zhou
Hao Li
VGen
69
0
0
28 Feb 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
92
0
0
27 Feb 2025
TransVDM: Motion-Constrained Video Diffusion Model for Transparent Video Synthesis
Menghao Li
Zhenghao Zhang
Junchao Liao
Long Qin
Weizhi Wang
DiffM
VGen
69
0
0
26 Feb 2025
X-Dancer: Expressive Music to Human Dance Video Generation
Zeyuan Chen
Hongyi Xu
Guoxian Song
You Xie
Chenxu Zhang
Xiusi Chen
Chao Wang
Di Chang
Linjie Luo
VGen
43
0
0
24 Feb 2025
PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
Zhengqing Wang
Jiacheng Chen
Yasutaka Furukawa
66
5
0
24 Feb 2025
Text-to-Image Rectified Flow as Plug-and-Play Priors
Xiaofeng Yang
Cheng Chen
Xulei Yang
Fayao Liu
Guosheng Lin
DiffM
73
7
0
21 Feb 2025
Accelerating Diffusion Transformers with Token-wise Feature Caching
Chang Zou
Xuyang Liu
Ting Liu
Siteng Huang
Linfeng Zhang
54
14
0
20 Feb 2025
CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image
Kaixin Yao
Longwen Zhang
Xinhao Yan
Yan Zeng
Qixuan Zhang
Wei Yang
Lan Xu
Jiayuan Gu
Jingyi Yu
29
3
0
18 Feb 2025
VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation
Xinlong Chen
Yang Zhang
Chongling Rao
Yushuo Guan
Jiaheng Liu
Fuzheng Zhang
Chengru Song
Qiang Liu
Di Zhang
Tieniu Tan
15
0
0
18 Feb 2025
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
Junxian Ma
Shiwen Wang
Jian Yang
Junyi Hu
Jian Liang
Guosheng Lin
Jingbo Chen
Kai Li
Yu Meng
DiffM
VGen
61
3
0
17 Feb 2025
Phantom: Subject-consistent video generation via cross-modal alignment
Lijie Liu
Tianxiang Ma
Bingchuan Li
Zhuowei Chen
Jiawei Liu
Qian He
Xinglong Wu
DiffM
VGen
52
5
0
16 Feb 2025
Learning Human Skill Generators at Key-Step Levels
Yilu Wu
Chenhui Zhu
Shuai Wang
Hanlin Wang
Jing Wang
Zhaoxiang Zhang
Limin Wang
VGen
119
0
0
12 Feb 2025
History-Guided Video Diffusion
Kiwhan Song
Boyuan Chen
Max Simchowitz
Yilun Du
Russ Tedrake
Vincent Sitzmann
VGen
117
7
0
10 Feb 2025
Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance
Li Hu
Guangyuan Wang
Zhen Shen
Xin Gao
Dechao Meng
Lian Zhuo
Peng Zhang
Bang Zhang
Liefeng Bo
DiffM
VGen
101
9
0
10 Feb 2025
VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer
Xinyu Liu
Ailing Zeng
Wei Xue
Harry Yang
Wenhan Luo
Qifeng Liu
Yike Guo
VGen
171
0
0
09 Feb 2025
A Physical Coherence Benchmark for Evaluating Video Generation Models via Optical Flow-guided Frame Prediction
Yongfan Chen
Xiuwen Zhu
Tianyu Li
EGVM
VGen
56
3
0
08 Feb 2025
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM
Yueying Zou
Peipei Li
Zekun Li
Huaibo Huang
Xing Cui
Xuannan Liu
Chenghanyu Zhang
Ran He
DeLMO
125
2
0
07 Feb 2025
MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation
Jinbo Xing
Long Mai
Cusuh Ham
Jiahui Huang
Aniruddha Mahapatra
Chi-Wing Fu
T. Wong
Feng Liu
DiffM
VGen
128
2
0
06 Feb 2025
MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation
Haibo Tong
Zhaoyang Wang
Zhengzhang Chen
Haonian Ji
Shi Qiu
...
Peng Xia
Mingyu Ding
Rafael Rafailov
Chelsea Finn
Huaxiu Yao
EGVM
VGen
102
2
0
03 Feb 2025
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
Gaojie Lin
Jianwen Jiang
Jiaqi Yang
Zerong Zheng
Chao Liang
DiffM
VGen
183
11
0
03 Feb 2025
Dissecting Submission Limit in Desk-Rejections: A Mathematical Analysis of Fairness in AI Conference Policies
Yuefan Cao
Xiaoyu Li
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
92
7
0
02 Feb 2025
Consistent Video Colorization via Palette Guidance
Han Wang
Yuang Zhang
Yuhong Zhang
Lingxiao Lu
Li-Na Song
DiffM
VGen
88
0
0
31 Jan 2025
Improving Tropical Cyclone Forecasting With Video Diffusion Models
Zhibo Ren
Pritthijit Nath
Pancham Shukla
41
0
0
27 Jan 2025
VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking
Runyi Hu
Jingyang Zhang
Y. Li
Jiwei Li
Qing-Wu Guo
Han Qiu
Tianwei Zhang
WIGM
VGen
81
4
0
24 Jan 2025
PreciseCam: Precise Camera Control for Text-to-Image Generation
Edurne Bernal-Berdun
Ana Serrano
B. Masiá
Matheus Gadelha
Yannick Hold-Geoffroy
Xin Sun
Diego F. F. Gutierrez
DiffM
VGen
47
0
0
22 Jan 2025
Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving
Yu Yang
Jianbiao Mei
Yukai Ma
Siliang Du
Wenqing Chen
Yijie Qian
Yuxiang Feng
Yong-jin Liu
92
11
0
20 Jan 2025
MEt3R: Measuring Multi-View Consistency in Generated Images
Mohammad Asim
Christopher Wewer
Thomas Wimmer
Bernt Schiele
J. E. Lenssen
EGVM
3DGS
VGen
46
7
0
10 Jan 2025
RealCustom++: Representing Images as Real-Word for Real-Time Customization
Zhendong Mao
Mengqi Huang
Fei Ding
Mingcong Liu
Qian He
Xiaojun Chang
DiffM
78
6
0
03 Jan 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
124
2
0
03 Jan 2025
RORem: Training a Robust Object Remover with Human-in-the-Loop
Ruibin Li
Tao Yang
Song Guo
L. Zhang
53
3
0
01 Jan 2025