2304.08818
Cited By
v1
v2 (latest)
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
18 April 2023
A. Blattmann
Robin Rombach
Huan Ling
Tim Dockhorn
Seung Wook Kim
Sanja Fidler
Karsten Kreis
3DGS
VGen
Papers citing
"Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models"
50 / 273 papers shown
Title
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
Gaojie Lin
Jianwen Jiang
Jiaqi Yang
Zerong Zheng
Chao Liang
DiffM
VGen
395
29
0
01 Jul 2025
DepthART: Monocular Depth Estimation as Autoregressive Refinement Task
Bulat Gabdullin
Nina Konovalova
Nikolay Patakin
Dmitry Senushkin
Anton Konushin
MDE
82
1
0
01 Jul 2025
Emergent Temporal Correspondences from Video Diffusion Transformers
Jisu Nam
Soowon Son
Dahyun Chung
Jiyoung Kim
Siyoon Jin
Junhwa Hur
Seungryong Kim
VGen
32
0
0
20 Jun 2025
Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition
Jiaqi Li
Junshu Tang
Zhiyong Xu
Longhuang Wu
Yuan Zhou
Shuai Shao
Tianbao Yu
Zhiguo Cao
Qinglin Lu
DiffM
VGen
22
0
0
20 Jun 2025
VS-Singer: Vision-Guided Stereo Singing Voice Synthesis with Consistency Schrödinger Bridge
Zijing Zhao
Kai Wang
Hao-Ming Huang
Ying Hu
Liang He
J. Yang
24
0
0
19 Jun 2025
Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model
Anirud Aggarwal
Abhinav Shrivastava
M. Gwilliam
58
0
0
18 Jun 2025
X-Scene: Large-Scale Driving Scene Generation with High Fidelity and Flexible Controllability
Yu Yang
Alan Liang
Jianbiao Mei
Yukai Ma
Yong-Jin Liu
Gim Hee Lee
VGen
34
0
0
16 Jun 2025
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis
Yuan Gao
Mattia Piccinini
Yuchen Zhang
Dingrui Wang
Korbinian Moller
...
Steven Peters
Andrea Stocco
Bassam Alrifaee
Marco Pavone
Johannes Betz
30
0
0
13 Jun 2025
DAVID-XR1: Detecting AI-Generated Videos with Explainable Reasoning
Yifeng Gao
Yifan Ding
Hongyu Su
Juncheng Li
Yunhan Zhao
...
Li Wang
Xin Wang
Yixu Wang
Xingjun Ma
Yu-Gang Jiang
VGen
19
0
0
13 Jun 2025
The Diffusion Duality
Subham S. Sahoo
Justin Deschenaux
Aaron Gokaslan
Guanghan Wang
Justin T Chiu
Volodymyr Kuleshov
DiffM
128
4
0
12 Jun 2025
Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation
Zhiyang Xu
Jiuhai Chen
Zhaojiang Lin
Xichen Pan
Lifu Huang
...
Di Jin
Michihiro Yasunaga
Lili Yu
Xi Lin
Shaoliang Nie
125
1
0
12 Jun 2025
Geometric Regularity in Deterministic Sampling of Diffusion-based Generative Models
Defang Chen
Zhenyu Zhou
C. Wang
Siwei Lyu
DiffM
69
0
0
11 Jun 2025
SPARKE: Scalable Prompt-Aware Diversity Guidance in Diffusion Models via RKE Score
Mohammad Jalali
Haoyu Lei
Amin Gohari
Farzan Farnia
DiffM
66
0
0
11 Jun 2025
NnD: Diffusion-based Generation of Physically-Nonnegative Objects
Nadav Torem
Tamar Sde-Chen
Y. Schechner
DiffM
73
0
0
11 Jun 2025
From Pixels to Graphs: using Scene and Knowledge Graphs for HD-EPIC VQA Challenge
Agnese Taluzzi
Davide Gesualdi
Riccardo Santambrogio
Chiara Plizzari
Francesca Palermo
S. Mentasti
Matteo Matteucci
GNN
53
2
0
10 Jun 2025
EgoM2P: Egocentric Multimodal Multitask Pretraining
Gen Li
Yutong Chen
Yiqian Wu
Kaifeng Zhao
Marc Pollefeys
Siyu Tang
EgoV
VLM
44
0
0
09 Jun 2025
NOVA3D: Normal Aligned Video Diffusion Model for Single Image to 3D Generation
Yuxiao Yang
Peihao Li
Yuhong Zhang
Junzhe Lu
Xianglong He
Minghan Qin
Weitao Wang
Haoqian Wang
DiffM
VGen
25
0
0
09 Jun 2025
TV-LiVE: Training-Free, Text-Guided Video Editing via Layer Informed Vitality Exploitation
M. Kim
Dongjin Kim
Seokju Yun
Jaegul Choo
DiffM
VGen
35
0
0
08 Jun 2025
Noise Consistency Regularization for Improved Subject-Driven Image Synthesis
Yao Ni
Song Wen
Piotr Koniusz
A. Cherian
23
0
0
06 Jun 2025
ContentV: Efficient Training of Video Generation Models with Limited Compute
Wenfeng Lin
Renjie Chen
Boyuan Liu
Shiyue Yan
Ruoyu Feng
...
Chao Feng
Jiao Ran
Qi Wu
Zuotao Liu
Mingyu Guo
VGen
117
0
0
05 Jun 2025
FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion
Akide Liu
Zeyu Zhang
Zhexin Li
Xuehai Bai
Yizeng Han
...
Jiahao He
Yuanyu He
F. Wang
Gholamreza Haffari
Bohan Zhuang
VGen
MQ
148
1
0
05 Jun 2025
Perfecting Depth: Uncertainty-Aware Enhancement of Metric Depth
Jinyoung Jun
Lei Chu
Jiahao Li
Yan Lu
Chang-Su Kim
MDE
144
0
0
05 Jun 2025
DualX-VSR: Dual Axial Spatial×Temporal Transformer for Real-World Video Super-Resolution without Motion Compensation
Shuo Cao
Yihao Liu
Xiaohui Li
Yuanting Gao
Yu Zhou
Chao Dong
115
0
0
05 Jun 2025
Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences
Yunhong Lu
Qichao Wang
H. Cao
Xiaoyin Xu
Min Zhang
57
0
0
03 Jun 2025
DCM: Dual-Expert Consistency Model for Efficient and High-Quality Video Generation
Zhengyao Lv
Chenyang Si
Tianlin Pan
Zhaoxi Chen
Kwan-Yee K. Wong
Yu Qiao
Ziwei Liu
DiffM
VGen
50
0
0
03 Jun 2025
OmniV2V: Versatile Video Generation and Editing via Dynamic Content Manipulation
Sen Liang
Zhentao Yu
Zhengguang Zhou
Teng Hu
Hongmei Wang
...
Qin Lin
Yuan Zhou
Xin Li
Qinglin Lu
Zhibo Chen
DiffM
VGen
SyDa
58
0
0
02 Jun 2025
Many-for-Many: Unify the Training of Multiple Video and Image Generation and Manipulation Tasks
Tao Yang
Ruibin Li
Yangming Shi
Yuqi Zhang
Qide Dong
Haoran Cheng
Weiguo Feng
Shilei Wen
Bingyue Peng
Lei Zhang
DiffM
VGen
73
0
0
02 Jun 2025
DiffuseSlide: Training-Free High Frame Rate Video Generation Diffusion
Geunmin Hwang
Hyun-kyu Ko
Younghyun Kim
S. W. Lee
Eunbyung Park
VGen
56
0
0
02 Jun 2025
ATI: Any Trajectory Instruction for Controllable Video Generation
Angtian Wang
Haibin Huang
Jacob Zhiyuan Fang
Yiding Yang
Chongyang Ma
DiffM
VGen
79
0
0
28 May 2025
Autoregression-free video prediction using diffusion model for mitigating error propagation
Woonho Ko
Jin Bok Park
Il Yong Chun
DiffM
VGen
48
0
0
28 May 2025
GeoDrive: 3D Geometry-Informed Driving World Model with Precise Action Control
Anthony Chen
Wenzhao Zheng
Yida Wang
Xueyang Zhang
Kun Zhan
Peng Jia
Kurt Keutzer
Shanghang Zhang
105
1
0
28 May 2025
Advancing high-fidelity 3D and Texture Generation with 2.5D latents
Xin Yang
Jiantao Lin
Yingjie Xu
Haodong Li
Yingcong Chen
3DV
66
0
0
27 May 2025
Normalized Attention Guidance: Universal Negative Guidance for Diffusion Models
Dar-Yen Chen
Hmrishav Bandyopadhyay
Kai Zou
Yi-Zhe Song
56
0
0
27 May 2025
Conditional Diffusion Models with Classifier-Free Gibbs-like Guidance
Badr Moufad
Yazid Janati
Alain Durmus
Ahmed Ghorbel
Eric Moulines
Jimmy Olsson
DiffM
82
0
0
27 May 2025
Frame-Level Captions for Long Video Generation with Complex Multi Scenes
Guangcong Zheng
Jianlong Yuan
Bo Wang
Haoyang Huang
Guoqing Ma
Nan Duan
DiffM
VGen
81
0
0
27 May 2025
Sci-Fi: Symmetric Constraint for Frame Inbetweening
Liuhan Chen
Xiaodong Cun
Xiaoyu Li
Xianyi He
Shenghai Yuan
Jie Chen
Ying Shan
Li Yuan
VGen
81
0
0
27 May 2025
HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation for Multiple Characters
Yi Chen
Sen Liang
Zixiang Zhou
Ziyao Huang
Yifeng Ma
Junshu Tang
Qin Lin
Yuan Zhou
Qinglin Lu
VGen
54
0
0
26 May 2025
ProphetDWM: A Driving World Model for Rolling Out Future Actions and Videos
Xiaodong Wang
Peixi Peng
VGen
1.1K
1
0
24 May 2025
One Policy but Many Worlds: A Scalable Unified Policy for Versatile Humanoid Locomotion
Yahao Fan
Tianxiang Gui
Kaiyang Ji
Shutong Ding
C. Zhang
Jiayuan Gu
Jingyi Yu
Jingya Wang
Ye-ling Shi
VGen
97
0
0
24 May 2025
FLEX: A Backbone for Diffusion-Based Modeling of Spatio-temporal Physical Systems
N. Benjamin Erichson
Vinicius Mikuni
Dongwei Lyu
Yang Gao
Omri Azencot
Soon Hoe Lim
Michael W. Mahoney
AI4CE
898
0
0
23 May 2025
A Minimalist Method for Fine-tuning Text-to-Image Diffusion Models
Yanting Miao
William Loh
Suraj Kothawade
Pascal Poupart
47
0
0
23 May 2025
Temporal Differential Fields for 4D Motion Modeling via Image-to-Video Synthesis
Xin You
Minghui Zhang
Hanxiao Zhang
J. Yang
Nassir Navab
DiffM
VGen
MedIm
240
0
0
22 May 2025
Learning to Integrate Diffusion ODEs by Averaging the Derivatives
Wenze Liu
Xiangyu Yue
84
0
0
20 May 2025
A Challenge to Build Neuro-Symbolic Video Agents
Sahil Shah
Harsh Goel
Sai Shankar Narasimhan
Minkyu Choi
S P Sharan
Oguzhan Akcin
Sandeep Chinchali
AI4TS
78
0
0
20 May 2025
Robust Planning for Autonomous Driving via Mixed Adversarial Diffusion Predictions
Albert Zhao
Stefano Soatto
DiffM
140
0
0
18 May 2025
Generative Pre-trained Autoregressive Diffusion Transformer
Yuan Zhang
Jiacheng Jiang
Guoqing Ma
Zhiying Lu
Haoyang Huang
Jianlong Yuan
Nan Duan
VGen
140
2
0
12 May 2025
BridgeIV: Bridging Customized Image and Video Generation through Test-Time Autoregressive Identity Propagation
Panwen Hu
Jiehui Huang
Qiang Sun
Xiaodan Liang
DiffM
VGen
109
0
0
11 May 2025
Fast Flow-based Visuomotor Policies via Conditional Optimal Transport Couplings
Andreas Sochopoulos
Nikolay Malkin
Nikolaos Tsagkas
João Moura
Michael Gienger
S. Vijayakumar
90
1
0
02 May 2025
Direct Motion Models for Assessing Generated Videos
Kelsey R. Allen
Carl Doersch
Guangyao Zhou
Mohammed Suhail
Danny Driess
...
Thomas Kipf
Mehdi S. M. Sajjadi
Kevin P. Murphy
João Carreira
Sjoerd van Steenkiste
EGVM
DiffM
VGen
168
0
0
30 Apr 2025
FlowLoss: Dynamic Flow-Conditioned Loss Strategy for Video Diffusion Models
Kuanting Wu
Kei Ota
Asako Kanezaki
DiffM
VGen
120
0
0
20 Apr 2025