arXiv: 2401.03048
Latte: Latent Diffusion Transformer for Video Generation
5 January 2024
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Ziqiang Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
DiffM
VGen
Papers citing "Latte: Latent Diffusion Transformer for Video Generation"
50 / 271 papers shown
Title
AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations
Quang-Trung Truong
Wong Yuk Kwan
Duc Thanh Nguyen
Binh-Son Hua
Sai-Kit Yeung
VGen
85
0
0
17 Mar 2025
Diffusion Dynamics Models with Generative State Estimation for Cloth Manipulation
Tongxuan Tian
Haoyang Li
Bo Ai
Xiaodi Yuan
Zhiao Huang
H. Su
DiffM
AI4CE
97
3
0
15 Mar 2025
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
Hao He
Ceyuan Yang
Shanchuan Lin
Yinghao Xu
Meng Wei
Liangke Gui
Qi Zhao
Gordon Wetzstein
Lu Jiang
Hongsheng Li
DiffM
VGen
145
11
0
13 Mar 2025
Semantic Latent Motion for Portrait Video Generation
Qiyuan Zhang
Chenyu Wu
Wenzhang Sun
Huaize Liu
Donglin Di
Wei Chen
Changqing Zou
VGen
87
0
0
13 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Ping Luo
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
115
15
0
13 Mar 2025
DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image
Qi Zhao
Zhan Ma
Pan Zhou
VGen
120
0
0
13 Mar 2025
Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption
Luozheng Qin
Zhiyu Tan
Mengping Yang
Xiaomeng Yang
Hao Li
143
0
0
12 Mar 2025
TPDiff: Temporal Pyramid Video Diffusion Model
L. Ran
Mike Zheng Shou
108
0
0
12 Mar 2025
Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework
Jing Wang
Fengzhuo Zhang
Xiaoli Li
Vincent Y. F. Tan
Tianyu Pang
Chao Du
Aixin Sun
Zhuoran Yang
VGen
96
1
0
12 Mar 2025
VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion
Lehan Yang
Jincen Song
Tianlong Wang
Daiqing Qi
Weili Shi
Yuheng Liu
Sheng Li
DiffM
VOS
VGen
102
0
0
11 Mar 2025
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion
Mingzhen Sun
Weining Wang
Gen Li
Jiawei Liu
Jiahui Sun
Wanquan Feng
Shanshan Lao
Siyu Zhou
Qian He
Qingbin Liu
DiffM
VGen
112
3
0
10 Mar 2025
DreamRelation: Relation-Centric Video Customization
Yujie Wei
Shiwei Zhang
Hangjie Yuan
Biao Gong
Longxiang Tang
...
Haonan Qiu
Hengjia Li
Shuai Tan
Yize Zhang
Hongming Shan
VGen
97
1
0
10 Mar 2025
TR-DQ: Time-Rotation Diffusion Quantization
Yihua Shao
Deyang Lin
Fanhu Zeng
Minxi Yan
Hao Fei
...
Haozhe Wang
Jiaxin Guo
Yan Wang
Haotong Qin
Hao Tang
MQ
DiffM
116
3
0
09 Mar 2025
An Egocentric Vision-Language Model based Portable Real-time Smart Assistant
Yuanmin Huang
Jilan Xu
Baoqi Pei
Yuping He
Guo Chen
...
Xinyuan Chen
Yaohui Wang
Yali Wang
Yu Qiao
Limin Wang
109
2
0
06 Mar 2025
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
Rui Zhao
Weijia Mao
Mike Zheng Shou
89
0
0
05 Mar 2025
GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning
Zhun Mou
Bin Xia
Zhengchao Huang
Wenming Yang
Jiaya Jia
VGen
ELM
LRM
87
1
0
04 Mar 2025
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
Wenhao Wang
Yue Yang
DiffM
VGen
139
0
0
03 Mar 2025
Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos
Zhiyu Tan
Junyan Wang
Hao Yang
Luozheng Qin
Hesen Chen
Qiang-feng Zhou
Hao Li
VGen
111
1
0
28 Feb 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
106
1
0
27 Feb 2025
SpecDM: Hyperspectral Dataset Synthesis with Pixel-level Semantic Annotations
Wen Liu
Pei Yang
Wenhui Hong
Xiaoguang Mei
Jiayi Ma
DiffM
79
0
0
24 Feb 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM
VGen
87
0
0
18 Feb 2025
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
Junxian Ma
Shiwen Wang
Jian Yang
Junyi Hu
Jian Liang
Guosheng Lin
Jingbo Chen
Kai Li
Yu Meng
DiffM
VGen
90
3
0
17 Feb 2025
Learning Human Skill Generators at Key-Step Levels
Yilu Wu
Chenhui Zhu
Shuai Wang
Hanlin Wang
Jing Wang
Zhaoxiang Zhang
Limin Wang
VGen
156
0
0
12 Feb 2025
History-Guided Video Diffusion
Kiwhan Song
Boyuan Chen
Max Simchowitz
Yilun Du
Russ Tedrake
Vincent Sitzmann
VGen
159
14
0
10 Feb 2025
Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile
Hangliang Ding
Dacheng Li
Runlong Su
Peiyuan Zhang
Zhijie Deng
Ion Stoica
Hao Zhang
VGen
106
7
0
10 Feb 2025
Pre-Trained Video Generative Models as World Simulators
Haoran He
Yang Zhang
Liang Lin
Zhihao Xu
Ling Pan
VGen
96
4
0
10 Feb 2025
VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer
Xinyu Liu
Ailing Zeng
Wei Xue
Harry Yang
Wenhan Luo
Qifeng Liu
Yike Guo
VGen
264
1
0
09 Feb 2025
IPO: Iterative Preference Optimization for Text-to-Video Generation
Xiaomeng Yang
Zhiyu Tan
Xuecheng Nie
VGen
127
3
0
04 Feb 2025
Taming Teacher Forcing for Masked Autoregressive Video Generation
Deyu Zhou
Quan Sun
Yuang Peng
Kun Yan
Runpei Dong
...
Zheng Ge
Nan Duan
Xiangyu Zhang
L. Ni
H. Shum
VGen
78
7
0
21 Jan 2025
Ditto: Accelerating Diffusion Model via Temporal Value Similarity
Sungbin Kim
Hyunwuk Lee
Wonho Cho
Mincheol Park
Won Woo Ro
111
1
0
20 Jan 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
167
3
0
03 Jan 2025
AKiRa: Augmentation Kit on Rays for optical video generation
Xi Wang
Robin Courant
Marc Christie
Vicky Kalogeiton
VGen
151
3
0
31 Dec 2024
Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model
Yuanmin Huang
Jilan Xu
Baoqi Pei
Yuping He
Guo Chen
...
Kunpeng Li
C. Yuan
Yidan Wang
Yu Qiao
L. Wang
114
6
0
31 Dec 2024
Bridging Interpretability and Robustness Using LIME-Guided Model Refinement
Navid Nayyem
Abdullah Rakin
Longwei Wang
AAML
FAtt
89
2
0
25 Dec 2024
AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration
Wenhao Sun
Rong-Cheng Tu
Jingyi Liao
Zhao Jin
Dacheng Tao
VGen
160
1
0
16 Dec 2024
VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping
Hao Shao
Shulun Wang
Yang Zhou
Guanglu Song
Dailan He
Shuo Qin
Zhuofan Zong
Bingqi Ma
Yang Liu
Hongsheng Li
VGen
DiffM
126
0
0
15 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
329
3
0
14 Dec 2024
UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer
Delong Liu
Zhaohui Hou
Mingjie Zhan
Shihao Han
Zhicheng Zhao
Fei Su
VGen
101
0
0
12 Dec 2024
T-SVG: Text-Driven Stereoscopic Video Generation
Qiao Jin
Xiaodong Chen
Wu Liu
Tao Mei
Yongdong Zhang
DiffM
VGen
102
2
0
12 Dec 2024
StyleMaster: Stylize Your Video with Artistic Generation and Translation
Zixuan Ye
Huijuan Huang
Xintao Wang
Pengfei Wan
Di Zhang
Wenhan Luo
DiffM
VGen
120
4
0
10 Dec 2024
[MASK] is All You Need
Vincent Tao Hu
Bjorn Ommer
DiffM
174
5
0
09 Dec 2024
Remix-DiT: Mixing Diffusion Transformers for Multi-Expert Denoising
Gongfan Fang
Xinyin Ma
Xinchao Wang
DiffM
MoE
145
1
0
07 Dec 2024
DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models
Yizhuo Li
Yuying Ge
Yixiao Ge
Ping Luo
Ying Shan
DiffM
VGen
163
0
0
05 Dec 2024
CPA: Camera-pose-awareness Diffusion Transformer for Video Generation
Yuelei Wang
Jian Zhang
Pengtao Jiang
Hao Zhang
Jinwei Chen
Bo Li
VGen
DiffM
142
4
0
02 Dec 2024
ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration
Chaojun Ni
Guosheng Zhao
Xiaofeng Wang
Zheng Hua Zhu
Wenkang Qin
...
Kun Zhan
Peng Jia
Xianpeng Lang
Xingang Wang
Wenjun Mei
VGen
321
9
0
29 Nov 2024
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
Florinel-Alin Croitoru
Andrei Iulian Hiji
Vlad Hondru
Nicolae-Cătălin Ristea
Paul Irofti
Marius Popescu
Cristian Rusu
Radu Tudor Ionescu
Fahad Shahbaz Khan
Mubarak Shah
125
5
0
29 Nov 2024
Open-Sora Plan: Open-Source Large Video Generation Model
Bin Lin
Yunyang Ge
Xinhua Cheng
Zongjian Li
Bin Zhu
...
Zhang Pan
Xing Zhou
Shaoling Dong
Yonghong Tian
Li-xin Yuan
VLM
VGen
159
80
0
28 Nov 2024
SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing
Rong-Cheng Tu
Wenhao Sun
Zhao Jin
Jingyi Liao
Jiaxing Huang
Dacheng Tao
VGen
DiffM
142
6
0
28 Nov 2024
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Feng Liu
Shiwei Zhang
Xiaofeng Wang
Yujie Wei
Haonan Qiu
Yuzhong Zhao
Yingya Zhang
Qixiang Ye
Fang Wan
VGen
AI4TS
166
22
0
28 Nov 2024
StableAnimator: High-Quality Identity-Preserving Human Image Animation
Shuyuan Tu
Zhen Xing
Xintong Han
Zhi-Qi Cheng
Qi Dai
Chong Luo
Zuxuan Wu
VGen
157
21
0
26 Nov 2024