2304.08818
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
18 April 2023
A. Blattmann
Robin Rombach
Huan Ling
Tim Dockhorn
Seung Wook Kim
Sanja Fidler
Karsten Kreis
3DGS
VGen
Papers citing
"Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models"
50 / 273 papers shown
Title
Generalization through variance: how noise shapes inductive biases in diffusion models
John J. Vastola
DiffM
486
5
0
16 Apr 2025
VideoPanda: Video Panoramic Diffusion with Multi-view Attention
Kevin Xie
Amirmojtaba Sabour
Jiahui Huang
Despoina Paschalidou
G. Klár
Umar Iqbal
Sanja Fidler
Fangyin Wei
VGen
MDE
125
1
0
15 Apr 2025
EquiVDM: Equivariant Video Diffusion Models with Temporally Consistent Noise
Chao Liu
Arash Vahdat
DiffM
VGen
97
2
0
14 Apr 2025
SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models
Stathis Galanakis
Alexandros Lattas
Stylianos Moschoglou
Bernhard Kainz
Stefanos Zafeiriou
DiffM
101
0
0
14 Apr 2025
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
Team Seaweed
Ceyuan Yang
Zhijie Lin
Yang Zhao
Shanchuan Lin
...
Zuquan Song
Zhenheng Yang
Jiashi Feng
Jianchao Yang
Lu Jiang
DiffM
186
22
0
11 Apr 2025
Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction
Zeren Jiang
Chuanxia Zheng
Iro Laina
Diane Larlus
Andrea Vedaldi
VGen
97
2
0
10 Apr 2025
Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos
Rundong Luo
Matthew Wallingford
Ali Farhadi
Noah Snavely
Wei-Chiu Ma
VGen
154
1
0
10 Apr 2025
Video-Bench: Human-Aligned Video Generation Benchmark
Hui Han
Siyuan Li
Jiaqi Chen
Yiwen Yuan
Yuling Wu
...
You Li
Jing Zhang
Chi Zhang
Li Li
Yongxin Ni
EGVM
VGen
220
0
0
07 Apr 2025
SketchVideo: Sketch-based Video Generation and Editing
Feng-Lin Liu
Hongbo Fu
Xintao Wang
Weicai Ye
Pengfei Wan
Di Zhang
Lin Gao
DiffM
VGen
138
0
0
30 Mar 2025
EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation
Hadrien Reynaud
Alberto Gomez
Paul Leeson
Qingjie Meng
Bernhard Kainz
MedIm
84
2
0
28 Mar 2025
HOT: Hadamard-based Optimized Training
Seonggon Kim
Juncheol Shin
Seung-taek Woo
Eunhyeok Park
119
0
0
27 Mar 2025
Video Motion Graphs
Haiyang Liu
Zhan Xu
Fa-Ting Hong
Hsin-Ping Huang
Yi Zhou
Yang Zhou
DiffM
VGen
157
1
0
26 Mar 2025
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models
Prin Phunyaphibarn
Phillip Y. Lee
Jaihoon Kim
Minhyuk Sung
DiffM
184
1
0
26 Mar 2025
Adapting Video Diffusion Models for Time-Lapse Microscopy
Alexander Holmberg
Nils Mechtel
Wei Ouyang
DiffM
VGen
112
0
0
24 Mar 2025
DiffusedWrinkles: A Diffusion-Based Model for Data-Driven Garment Animation
R. Vidaurre
Elena Garces
Dan Casas
DiffM
AI4CE
134
1
0
24 Mar 2025
UniCoRN: Latent Diffusion-based Unified Controllable Image Restoration Network across Multiple Degradations
Debabrata Mandal
Soumitri Chattopadhyay
Guansen Tong
Praneeth Chakravarthula
DiffM
105
1
0
20 Mar 2025
SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation
Chun-Han Yao
Yiming Xie
Vikram S. Voleti
Huaizu Jiang
Varun Jampani
3DGS
VGen
144
1
0
20 Mar 2025
ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos
Haolin Yang
Feilong Tang
Ming Hu
Yulong Li
Junjie Guo
...
Zelin Peng
Junjun He
Zongyuan Ge
Imran Razzak
DiffM
VGen
306
2
0
20 Mar 2025
Advances in 4D Generation: A Survey
Qiaowei Miao
Kehan Li
Jinsheng Quan
Zhiyuan Min
Shaojie Ma
Yichao Xu
Yi Yang
Yawei Luo
150
2
0
18 Mar 2025
SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model
Yucheng Mao
Boyang Wang
Nilesh Kulkarni
Jeong Joon Park
DiffM
112
0
0
18 Mar 2025
AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations
Quang-Trung Truong
Wong Yuk Kwan
Duc Thanh Nguyen
Binh-Son Hua
Sai-Kit Yeung
VGen
113
0
0
17 Mar 2025
DiffAD: A Unified Diffusion Modeling Approach for Autonomous Driving
Tao Wang
Cong Zhang
Xingguang Qu
Kun Li
Wen Liu
Chenyu Huang
117
1
0
15 Mar 2025
Neurons: Emulating the Human Visual Cortex Improves Fidelity and Interpretability in fMRI-to-Video Reconstruction
Haonan Wang
Qixiang Zhang
Lehan Wang
Xuanqi Huang
Xiaomeng Li
VOS
VGen
104
0
0
14 Mar 2025
V2Edit: Versatile Video Diffusion Editor for Videos and 3D Scenes
Yanming Zhang
Jun-Kun Chen
Jipeng Lyu
Yu-Xiong Wang
DiffM
VGen
116
0
0
13 Mar 2025
Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space
Jian Zhu
Zhengyu Jia
Tian Gao
Jiaxin Deng
Shidi Li
Fu Liu
Peng Jia
Xianpeng Lang
Xiaolong Sun
VGen
439
1
0
12 Mar 2025
Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework
Jing Wang
Fengzhuo Zhang
Xiaoli Li
Vincent Y. F. Tan
Tianyu Pang
Chao Du
Aixin Sun
Zhuoran Yang
VGen
121
2
0
12 Mar 2025
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation
Hyeonho Jeong
Suhyeon Lee
Jong Chul Ye
VGen
494
2
0
12 Mar 2025
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion
Lehan Yang
Jincen Song
Tianlong Wang
Daiqing Qi
Weili Shi
Yuheng Liu
Sheng Li
DiffM
VOS
VGen
135
0
0
11 Mar 2025
LightMotion: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation
Quanjian Song
Zhihang Lin
Zhanpeng Zeng
Ziyue Zhang
Liujuan Cao
Rongrong Ji
VGen
131
1
0
09 Mar 2025
Generative Video Bi-flow
Chen Liu
Tobias Ritschel
DiffM
VGen
100
0
0
09 Mar 2025
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation
Runze Zhang
Guoguang Du
Xiaochuan Li
Qi Jia
Liang Jin
...
Zhenhua Guo
Yaqian Zhao
Xiaoli Gong
Rengang Li
Baoyu Fan
VGen
130
2
0
08 Mar 2025
Text2Story: Advancing Video Storytelling with Text Guidance
Taewon Kang
D. Kothandaraman
Ming C. Lin
DiffM
VGen
136
2
0
08 Mar 2025
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
Mark YU
Wenbo Hu
Jinbo Xing
Ying Shan
VGen
154
12
0
07 Mar 2025
FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video
Yue Gao
Hong-Xing Yu
Bo Zhu
Jiajun Wu
VGen
119
2
0
06 Mar 2025
How to Move Your Dragon: Text-to-Motion Synthesis for Large-Vocabulary Objects
Wonkwang Lee
Jongwon Jeong
Taehong Moon
Hyeon-Jong Kim
Jaehyeon Kim
Gunhee Kim
Byeong-Uk Lee
DiffM
149
0
0
06 Mar 2025
KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation
Antoni Bigata
Michał Stypułkowski
Rodrigo Mira
Stella Bounareli
Konstantinos Vougioukas
Zoe Landgraf
Nikita Drobyshev
Maciej Zięba
Stavros Petridis
Maja Pantic
DiffM
VGen
157
2
0
03 Mar 2025
WeGen: A Unified Model for Interactive Multimodal Generation as We Chat
Zhipeng Huang
Shaobin Zhuang
Canmiao Fu
Binxin Yang
Ying Zhang
Chong Sun
Zhizheng Zhang
Yali Wang
Chen Li
Zheng-Jun Zha
DiffM
123
3
0
03 Mar 2025
A Simple and Effective Reinforcement Learning Method for Text-to-Image Diffusion Fine-tuning
Shashank Gupta
Chaitanya Ahuja
Tsung-Yu Lin
Sreya Dutta Roy
Harrie Oosterhuis
Maarten de Rijke
Satya Narayan Shukla
119
2
0
02 Mar 2025
Learning to Animate Images from A Few Videos to Portray Delicate Human Actions
Haoxin Li
Yingchen Yu
Qilong Wu
Hanwang Zhang
Boyang Li
Song Bai
3DH
VGen
499
0
0
01 Mar 2025
EigenActor: Variant Body-Object Interaction Generation Evolved from Invariant Action Basis Reasoning
Xuehao Gao
Yang Yang
Shaoyi Du
Yang Wu
Y. Liu
Guo-Jun Qi
96
1
0
01 Mar 2025
Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos
Zhiyu Tan
Junyan Wang
Hao Yang
Luozheng Qin
Hesen Chen
Qiang-feng Zhou
Hao Li
VGen
129
1
0
28 Feb 2025
BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance
Xin Ye
Burhaneddin Yaman
Sheng Cheng
Feng Tao
Abhirup Mallik
Liu Ren
DiffM
117
2
0
27 Feb 2025
PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
Zhengqing Wang
Jiacheng Chen
Yasutaka Furukawa
151
8
0
24 Feb 2025
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling
Florent Bartoccioni
Elias Ramzi
Victor Besnier
Shashanka Venkataramanan
Tuan-Hung Vu
...
Mickael Chen
Éloi Zablocki
Andrei Bursuc
Eduardo Valle
Matthieu Cord
VGen
181
2
0
24 Feb 2025
Human2Robot: Learning Robot Actions from Paired Human-Robot Videos
Sicheng Xie
Haidong Cao
Zejia Weng
Zhen Xing
Shiwei Shen
Jiaqi Leng
Xipeng Qiu
Yanwei Fu
Zuxuan Wu
Yu Jiang
152
0
0
23 Feb 2025
FreqPrior: Improving Video Diffusion Models with Frequency Filtering Gaussian Noise
Yunlong Yuan
Yuanfan Guo
Chunwei Wang
Wei Zhang
Hang Xu
L. Zhang
DiffM
VGen
215
3
0
20 Feb 2025
SMITE: Segment Me In TimE
Amirhossein Alimohammadi
Sauradip Nag
Saeid Asgari Taghanaki
Andrea Tagliasacchi
Ghassan Hamarneh
Ali Mahdavi-Amiri
VLM
VOS
539
3
0
20 Feb 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM
VGen
119
0
0
18 Feb 2025
MotionMatcher: Motion Customization of Text-to-Video Diffusion Models via Motion Feature Matching
Yen-Siang Wu
Chi-Pin Huang
Fu-En Yang
Yu-Jie Wang
DiffM
VGen
125
1
0
18 Feb 2025
MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction
Jingcheng Ni
Yuxin Guo
Yichen Liu
Rui Chen
Lewei Lu
Z. Wu
DiffM
VGen
144
5
0
17 Feb 2025