ResearchTrend.AI
Latte: Latent Diffusion Transformer for Video Generation

5 January 2024
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Ziqiang Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
    DiffM
    VGen

Papers citing "Latte: Latent Diffusion Transformer for Video Generation"

50 / 271 papers shown
Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape
Ruichen Chen
Keith G. Mills
Liyao Jiang
Chao Gao
Di Niu
VGen
73
0
0
28 May 2025
Sci-Fi: Symmetric Constraint for Frame Inbetweening
Liuhan Chen
Xiaodong Cun
Xiaoyu Li
Xianyi He
Shenghai Yuan
Jie Chen
Ying Shan
Li Yuan
VGen
51
0
0
27 May 2025
MotionPro: A Precise Motion Controller for Image-to-Video Generation
Zhongwei Zhang
Fuchen Long
Zhaofan Qiu
Yingwei Pan
Wu Liu
Ting Yao
Tao Mei
DiffM
VGen
62
1
0
26 May 2025
FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation
Dong Liu
Jiayi Zhang
Yifan Li
Yanxuan Yu
Ben Lengerich
Ying Nian Wu
51
1
0
26 May 2025
Training-free Stylized Text-to-Image Generation with Fast Inference
X. Ma
Yaohui Wang
Xinyuan Chen
Tien-Tsin Wong
C. L. P. Chen
758
0
0
25 May 2025
DVD-Quant: Data-free Video Diffusion Transformers Quantization
Zhiteng Li
Hanxuan Li
Junyi Wu
Kai Liu
Linghe Kong
Guihai Chen
Yulun Zhang
Xiaokang Yang
MQ
VGen
59
0
0
24 May 2025
FLEX: A Backbone for Diffusion-Based Modeling of Spatio-temporal Physical Systems
N. Benjamin Erichson
Vinicius Mikuni
Dongwei Lyu
Yang Gao
Omri Azencot
Soon Hoe Lim
Michael W. Mahoney
AI4CE
860
0
0
23 May 2025
Training-Free Efficient Video Generation via Dynamic Token Carving
Yuechen Zhang
Jinbo Xing
Bin Xia
Shaoteng Liu
Bohao Peng
Xin Tao
Pengfei Wan
Eric Lo
Jiaya Jia
DiffM
VGen
55
0
0
22 May 2025
RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers
Ahmet Berke Gokmen
Yigit Ekin
Bahri Batuhan Bilecen
Aysegül Dündar
78
0
0
19 May 2025
FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance
Dian Shao
Mingfei Shi
Shengda Xu
Haodong Chen
Yongle Huang
Binglu Wang
3DH
60
0
0
19 May 2025
SounDiT: Geo-Contextual Soundscape-to-Landscape Generation
Junbo Wang
Haofeng Tan
Bowen Liao
Albert Jiang
Teng Fei
Qixing Huang
Zhengzhong Tu
Shan Ye
Yuhao Kang
83
0
0
19 May 2025
Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking
Zihan Su
Xuerui Qiu
Hongbin Xu
Tangyu Jiang
Junhao Zhuang
Chun Yuan
Ming Li
Shengfeng He
Fei Richard Yu
WIGM
63
0
0
19 May 2025
Video-GPT via Next Clip Diffusion
Shaobin Zhuang
Zhipeng Huang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Binxin Yang
Chong Sun
Chen Li
Yali Wang
DiffM
VGen
211
0
0
18 May 2025
VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption
Tianxiong Zhong
Xingye Tian
Boyuan Jiang
Xuebo Wang
Xin Tao
Pengfei Wan
Zhiwei Zhang
64
0
0
17 May 2025
LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation
Jiarui Wang
Huiyu Duan
Ziheng Jia
Yu Zhao
Woo Yi Yang
...
Zhongfu Chen
Juntong Wang
Yuke Xing
Guangtao Zhai
Xiongkuo Min
VGen
50
1
0
17 May 2025
Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration
Haipeng Fang
Sheng Tang
Juan Cao
Enshuo Zhang
Fan Tang
Tong-Yee Lee
69
0
0
16 May 2025
Generative Pre-trained Autoregressive Diffusion Transformer
Yuan Zhang
Jiacheng Jiang
Guoqing Ma
Zhiying Lu
Haoyang Huang
Jianlong Yuan
Nan Duan
VGen
93
1
0
12 May 2025
ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models
Ozgur Kara
Krishna Kumar Singh
Feng Liu
Duygu Ceylan
James M. Rehg
Tobias Hinz
DiffM
VGen
70
0
0
12 May 2025
Video Dataset Condensation with Diffusion Models
Zhe Li
Hadrien Reynaud
Mischa Dombrowski
Sarah Cechnicka
Franciskus Xaverius Erick
Bernhard Kainz
DD
VGen
69
0
0
10 May 2025
FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios
Shiyi Zhang
Junhao Zhuang
Zhaoyang Zhang
Ying Shan
Yansong Tang
VGen
139
0
0
06 May 2025
ADiff4TPP: Asynchronous Diffusion Models for Temporal Point Processes
Amartya Mukherjee
Ruizhi Deng
He Zhao
Yuzhen Mao
Leonid Sigal
Frederick Tung
DiffM
AI4TS
91
0
0
29 Apr 2025
RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
Haoran Geng
Feishi Wang
Songlin Wei
Yuchen Li
Bangjun Wang
...
Hao Dong
Siyuan Huang
Yue Wang
Jitendra Malik
Pieter Abbeel
126
8
0
26 Apr 2025
Latent Video Dataset Distillation
Ning Li
Antai Andy Liu
Jingran Zhang
Justin Cui
DD
VGen
113
0
0
23 Apr 2025
PMG: Progressive Motion Generation via Sparse Anchor Postures Curriculum Learning
Yingjie Xi
Jiangning Zhang
Xiaosong Yang
72
0
0
23 Apr 2025
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
Xuzhao Li
Chenming Wu
Zhao Yang
Zhihao Xu
Dingkang Liang
Yanzhe Zhang
Ji Wan
Jiadong Wang
VGen
91
1
0
22 Apr 2025
DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation
Weijie He
Mushui Liu
YunLong Yu
Zhao Wang
Chao Wu
DiffM
VGen
116
0
0
21 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
121
0
0
16 Apr 2025
WORLDMEM: Long-term Consistent World Simulation with Memory
Zeqi Xiao
Yushi Lan
Yifan Zhou
Wenqi Ouyang
Shuai Yang
Yanhong Zeng
Xingang Pan
117
2
0
16 Apr 2025
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
Jinfeng Xu
Yuanmin Huang
Baoqi Pei
Junlin Hou
Qingqiu Li
Guo Chen
Yuhui Zhang
Rui Feng
Weidi Xie
DiffM
72
3
0
16 Apr 2025
ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration
Yongsheng Yu
Haitian Zheng
Zhifei Zhang
Jianming Zhang
Yuqian Zhou
Connelly Barnes
Yixiao Liu
Wei Xiong
Zhe Lin
Jiebo Luo
95
0
0
11 Apr 2025
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization
Jialu Li
Shoubin Yu
Han Lin
Jaemin Cho
Jaehong Yoon
Joey Tianyi Zhou
DiffM
VGen
86
1
0
11 Apr 2025
Cellular Development Follows the Path of Minimum Action
Rohola Zandie
Farhan Khodaee
Yufan Xia
Elazer R. Edelman
68
0
0
10 Apr 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
124
0
0
09 Apr 2025
RAGME: Retrieval Augmented Video Generation for Enhanced Motion Realism
E. Peruzzo
Dejia Xu
Xingqian Xu
Humphrey Shi
N. Sebe
DiffM
VGen
78
0
0
09 Apr 2025
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration
Boyuan Wang
Runqi Ouyang
Xiaofeng Wang
Zheng Zhu
Guosheng Zhao
Chaojun Ni
Guan Huang
Lihong Liu
Xingang Wang
3DGS
114
0
0
04 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
Didong Li
Di Qiu
Jiadong Wang
Yikun Dou
...
Jinfeng Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
109
8
0
03 Apr 2025
OmniCam: Unified Multimodal Video Generation via Camera Control
Xiaoda Yang
Jiayang Xu
Kaixuan Luan
Xinyu Zhan
Hongshun Qiu
...
Shuai Yang
Li Zhang
Checheng Yu
Cewu Lu
Lixin Yang
DiffM
VGen
85
0
0
03 Apr 2025
HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation
Boyuan Wang
Xiaofeng Wang
Chaojun Ni
Guosheng Zhao
Zhiqin Yang
...
Yukun Zhou
Xinze Chen
Guan Huang
Lihong Liu
Xingang Wang
VGen
81
2
0
31 Mar 2025
EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation
Hadrien Reynaud
Alberto Gomez
Paul Leeson
Qingjie Meng
Bernhard Kainz
MedIm
68
1
0
28 Mar 2025
Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations
Haitong Liu
Kuofeng Gao
Yang Bai
Jinmin Li
Jinxiao Shan
Tao Dai
Shu-Tao Xia
AAML
98
1
0
26 Mar 2025
AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset
Haiyu Zhang
Xinyuan Chen
Yaohui Wang
Xihui Liu
Yunhong Wang
Yu Qiao
VGen
90
1
0
25 Mar 2025
EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models
Yufei Cai
Hu Han
Yuxiang Wei
Shiguang Shan
Xilin Chen
DiffM
VGen
74
0
0
25 Mar 2025
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Yuchao Gu
Weijia Mao
Mike Zheng Shou
VGen
137
5
0
25 Mar 2025
TransAnimate: Taming Layer Diffusion to Generate RGBA Video
Xuewei Chen
Zhimin Chen
Yiren Song
VGen
101
2
0
23 Mar 2025
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
Kaisi Guan
Zhengfeng Lai
Yizhou Sun
Peng Zhang
Wei Liu
Kieran Liu
Meng Cao
Ruihua Song
VGen
81
0
0
21 Mar 2025
Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks
Bhishma Dedhia
David Bourgin
Krishna Kumar Singh
Yuheng Li
Yan Kang
Zhan Xu
N. Jha
Yixiao Liu
DiffM
VGen
93
0
0
21 Mar 2025
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang
Haoran Chen
Haoyu Zhao
Guansong Lu
Yanwei Fu
Hang Xu
Zuxuan Wu
VGen
DiffM
114
2
0
20 Mar 2025
ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos
Haolin Yang
Feilong Tang
Ming Hu
Yulong Li
Junjie Guo
...
Zelin Peng
Junjun He
Zongyuan Ge
Imran Razzak
DiffM
VGen
206
2
0
20 Mar 2025
BlockDance: Reuse Structurally Similar Spatio-Temporal Features to Accelerate Diffusion Transformers
Hui Zhang
Tingwei Gao
Jie Shao
Zuxuan Wu
97
2
0
20 Mar 2025
MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation
Hongyu Zhang
Yufan Deng
Shenghai Yuan
Peng Jin
Zesen Cheng
Yian Zhao
Chang-Shu Liu
Jie Chen
DiffM
VGen
111
0
0
18 Mar 2025