2304.08818
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
18 April 2023
A. Blattmann
Robin Rombach
Huan Ling
Tim Dockhorn
Seung Wook Kim
Sanja Fidler
Karsten Kreis
3DGS
VGen
Papers citing
"Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models"
50 / 273 papers shown
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Peng Gao
Le Zhuo
Ziyi Lin
Ruoyi Du
Xu Luo
...
Weicai Ye
He Tong
Jingwen He
Yu Qiao
Hongsheng Li
VGen
103
91
0
09 May 2024
Video Diffusion Models: A Survey
Andrew Melnik
Michal Ljubljanac
Cong Lu
Qi Yan
Weiming Ren
Helge J. Ritter
VGen
147
16
0
06 May 2024
TwinDiffusion: Enhancing Coherence and Efficiency in Panoramic Image Generation with Diffusion Models
Teng Zhou
Yongchuan Tang
DiffM
144
2
0
30 Apr 2024
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
Sicheng Xu
Guojun Chen
Yu-Xiao Guo
Jiaolong Yang
Chong Li
Zhenyu Zang
Yizhong Zhang
Xin Tong
Baining Guo
91
102
0
16 Apr 2024
Four-hour thunderstorm nowcasting using deep diffusion models of satellite
Kuai Dai
Xutao Li
Junying Fang
Yunming Ye
Demin Yu
Hui Su
Di Xian
Danyu Qin
Jingsong Wang
AI4Cl
133
2
0
16 Apr 2024
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Hongxin Zhang
Zeyuan Wang
Qiushi Lyu
Zheyuan Zhang
Sunli Chen
Tianmin Shu
Kwonjoon Lee
Yilun Du
Chuang Gan
169
18
0
16 Apr 2024
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Shenghai Yuan
Jinfa Huang
Yujun Shi
Yongqi Xu
Ruijie Zhu
Bin Lin
Xinhua Cheng
Li-xin Yuan
Jiebo Luo
VGen
169
36
0
07 Apr 2024
LidarDM: Generative LiDAR Simulation in a Generated World
Vlas Zyrianov
Henry Che
Zhijian Liu
Shenlong Wang
VGen
109
23
0
03 Apr 2024
Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery
Christian Limberg
Artur Gonçalves
Bastien Rigault
Helmut Prendinger
101
6
0
02 Apr 2024
Frame by Familiar Frame: Understanding Replication in Video Diffusion Models
Aimon Rahman
Malsha V. Perera
Vishal M. Patel
VGen
88
7
0
28 Mar 2024
Spectral Motion Alignment for Video Motion Transfer using Diffusion Models
Geon Yeong Park
Hyeonho Jeong
Sang Wan Lee
Jong Chul Ye
VGen
DiffM
80
12
0
22 Mar 2024
Explorative Inbetweening of Time and Space
Haiwen Feng
Zheng Ding
Zhihao Xia
Simon Niklaus
Victoria Fernandez-Abrevaya
Michael J. Black
Xuaner Zhang
DiffM
VGen
85
10
0
21 Mar 2024
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Roberto Henschel
Levon Khachatryan
Daniil Hayrapetyan
Hayk Poghosyan
Vahram Tadevosyan
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
DiffM
VGen
239
89
0
21 Mar 2024
VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis
Yumeng Li
William H. Beluch
Margret Keuper
Dan Zhang
Anna Khoreva
DiffM
VGen
131
5
0
20 Mar 2024
FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis
Linjiang Huang
Rongyao Fang
Aiping Zhang
Guanglu Song
Si Liu
Yu Liu
Hongsheng Li
DiffM
92
28
0
19 Mar 2024
FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation
Shuai Yang
Yifan Zhou
Ziwei Liu
Chen Change Loy
VGen
DiffM
131
33
0
19 Mar 2024
Generative Enhancement for 3D Medical Images
Lingting Zhu
Noel Codella
Dongdong Chen
Zhenchao Jin
Lu Yuan
Lequan Yu
DiffM
MedIm
95
10
0
19 Mar 2024
DreamMotion: Space-Time Self-Similar Score Distillation for Zero-Shot Video Editing
Hyeonho Jeong
Jinho Chang
Geon Yeong Park
Jong Chul Ye
DiffM
VGen
104
18
0
18 Mar 2024
CasSR: Activating Image Power for Real-World Image Super-Resolution
Haolan Chen
Jinhua Hao
Kai Zhao
Kun Yuan
Ming Sun
Chao Zhou
Wei Hu
99
5
0
18 Mar 2024
SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation
Hongjian Liu
Qingsong Xie
Zhijie Deng
Chen Chen
Shixiang Tang
Fueyang Fu
Zheng-Jun Zha
H. Lu
114
9
0
03 Mar 2024
Accelerating Diffusion Sampling with Optimized Time Steps
Shuchen Xue
Zhaoqiang Liu
Fei Chen
Shifeng Zhang
Tianyang Hu
Enze Xie
Zhenguo Li
DiffM
147
29
0
27 Feb 2024
Diffusion Model-Based Image Editing: A Survey
Yi Huang
Jiancheng Huang
Yifan Liu
Mingfu Yan
Jiaxi Lv
Jianzhuang Liu
Wei Xiong
He Zhang
Liangliang Cao
EGVM
263
103
0
27 Feb 2024
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
Willi Menapace
Aliaksandr Siarohin
Ivan Skorokhodov
Ekaterina Deyneka
Tsai-Shien Chen
...
Yuwei Fang
A. Stoliar
Elisa Ricci
Jian Ren
Sergey Tulyakov
VGen
134
62
0
22 Feb 2024
VGMShield: Mitigating Misuse of Video Generative Models
Yan Pang
Yang Zhang
Tianhao Wang
119
3
0
20 Feb 2024
Denoising Diffusion via Image-Based Rendering
Titas Anciukevicius
Fabian Manhardt
Federico Tombari
Paul Henderson
138
13
0
05 Feb 2024
ActAnywhere: Subject-Aware Video Background Generation
Boxiao Pan
Zhan Xu
Chun-Hao Paul Huang
Krishna Kumar Singh
Yang Zhou
Leonidas Guibas
Jimei Yang
VGen
DiffM
61
3
0
19 Jan 2024
Vlogger: Make Your Dream A Vlog
Shaobin Zhuang
Kunchang Li
Xinyuan Chen
Yaohui Wang
Ziwei Liu
Yu Qiao
Yali Wang
VGen
DiffM
83
39
0
17 Jan 2024
Collaboratively Self-supervised Video Representation Learning for Action Recognition
Jie Zhang
Zhifan Wan
Lanqing Hu
Stephen Lin
Shuzhe Wu
Shiguang Shan
TTA
163
1
0
15 Jan 2024
Latte: Latent Diffusion Transformer for Video Generation
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Ziqiang Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
DiffM
VGen
291
280
0
05 Jan 2024
VASE: Object-Centric Appearance and Shape Manipulation of Real Videos
E. Peruzzo
Vidit Goel
Dejia Xu
Xingqian Xu
Yi Ding
Zhangyang Wang
Humphrey Shi
N. Sebe
LM&Ro
VGen
DiffM
126
12
0
04 Jan 2024
Towards Flexible, Scalable, and Adaptive Multi-Modal Conditioned Face Synthesis
Jingjing Ren
Cheng Xu
Haoyu Chen
Xinran Qin
Lei Zhu
CVBM
DiffM
99
4
0
26 Dec 2023
Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models
Huan Ling
Seung Wook Kim
Antonio Torralba
Sanja Fidler
Karsten Kreis
DiffM
3DGS
89
123
0
21 Dec 2023
MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers
Haoyu Ma
Shahin Mahdizadehaghdam
Bichen Wu
Zhipeng Fan
Yuchao Gu
Wenliang Zhao
Lior Shapira
Xiaohui Xie
DiffM
VGen
66
4
0
19 Dec 2023
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation
Akio Kodaira
Chenfeng Xu
Toshiki Hazama
Takanori Yoshimoto
Kohei Ohno
...
Soichi Sugano
Hanying Cho
Zhijian Liu
Kurt Keutzer
106
37
0
19 Dec 2023
Free3D: Consistent Novel View Synthesis without 3D Representation
Chuanxia Zheng
Andrea Vedaldi
3DV
136
50
0
07 Dec 2023
MEVG: Multi-event Video Generation with Text-to-Video Models
Gyeongrok Oh
Jaehwan Jeong
Sieun Kim
Wonmin Byeon
Jinkyu Kim
Sungwoong Kim
Sangpil Kim
VGen
DiffM
114
23
0
07 Dec 2023
FitDiff: Robust monocular 3D facial shape and reflectance estimation using Diffusion Models
Stathis Galanakis
Alexandros Lattas
Stylianos Moschoglou
Stefanos Zafeiriou
85
2
0
07 Dec 2023
DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance
Cong Wang
Jiaxi Gu
Panwen Hu
Songcen Xu
Hang Xu
Xiaodan Liang
VGen
109
16
0
05 Dec 2023
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling
Sherwin Bahmani
Ivan Skorokhodov
Victor Rong
Gordon Wetzstein
Leonidas Guibas
Peter Wonka
Sergey Tulyakov
Jeong Joon Park
Andrea Tagliasacchi
David B. Lindell
DiffM
143
112
0
29 Nov 2023
MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing
Haoyu Zhao
Tianyi Lu
Jiaxi Gu
Xing Zhang
Qingping Zheng
Zuxuan Wu
Hang Xu
Yu-Gang Jiang
VGen
DiffM
119
12
0
29 Nov 2023
A Unified Approach for Text- and Image-guided 4D Scene Generation
Yufeng Zheng
Xueting Li
Koki Nagano
Sifei Liu
Karsten Kreis
Otmar Hilliges
Shalini De Mello
108
49
0
28 Nov 2023
Flow-Guided Diffusion for Video Inpainting
Bohai Gu
Yongsheng Yu
Hengrui Fan
Libo Zhang
VGen
DiffM
102
12
0
26 Nov 2023
FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline
V.Ya. Arkhipkin
Zein Shaheen
Viacheslav Vasilev
E. Dakhova
Andrey Kuznetsov
Denis Dimitrov
DiffM
VGen
95
5
0
22 Nov 2023
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
Jiaxi Lv
Yi Huang
Mingfu Yan
Jiancheng Huang
Jianzhuang Liu
Yifan Liu
Yafei Wen
Xiaoxin Chen
Shifeng Chen
VGen
DiffM
119
25
0
21 Nov 2023
MoVideo: Motion-Aware Video Generation with Diffusion Models
Christos Sakaridis
Yuchen Fan
Kai Zhang
Radu Timofte
Luc Van Gool
Rakesh Ranjan
DiffM
VGen
85
10
0
19 Nov 2023
3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features
Chenfeng Xu
Huan Ling
Sanja Fidler
Or Litany
106
15
0
07 Nov 2023
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
Shiwei Zhang
Jiayu Wang
Yingya Zhang
Kang Zhao
Hangjie Yuan
Zhan Qin
Xiang Wang
Deli Zhao
Jingren Zhou
DiffM
VGen
139
231
0
07 Nov 2023
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
Xinyuan Chen
Yaohui Wang
Lingjun Zhang
Shaobin Zhuang
Xin Ma
Jiashuo Yu
Yali Wang
Dahua Lin
Yu Qiao
Ziwei Liu
VGen
DiffM
79
146
0
31 Oct 2023
CVPR 2023 Text Guided Video Editing Competition
Jay Zhangjie Wu
Xiuyu Li
Difei Gao
Zhen Dong
Jinbin Bai
...
Xu Cheng
Jie Tang
Mike Zheng Shou
Kurt Keutzer
Forrest N. Iandola
100
35
0
24 Oct 2023
MotionDirector: Motion Customization of Text-to-Video Diffusion Models
Rui Zhao
Yuchao Gu
Jay Zhangjie Wu
David Junhao Zhang
Jia-Wei Liu
Weijia Wu
Jussi Keppo
Mike Zheng Shou
DiffM
VGen
113
118
0
12 Oct 2023