Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2408.06072
Cited By
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
12 August 2024
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
Jiazheng Xu
Yuanming Yang
Wenyi Hong
Xiaohan Zhang
Guanyu Feng
Da Yin
Yuxuan Zhang
Weihan Wang
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
DiffM
VGen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer"
50 / 309 papers shown
Title
Mask
2
^2
2
DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation
Tianhao Qi
Jianlong Yuan
Wanquan Feng
Shancheng Fang
Jiawei Liu
Siyu Zhou
Qian He
Hongtao Xie
Yongdong Zhang
DiffM
VGen
44
0
0
25 Mar 2025
Video-T1: Test-Time Scaling for Video Generation
F. Liu
Hanyang Wang
Yimo Cai
Kaiyan Zhang
Xiaohang Zhan
Yueqi Duan
DiffM
VGen
78
1
0
24 Mar 2025
Target-Aware Video Diffusion Models
Taeksoo Kim
Hanbyul Joo
DiffM
VGen
91
1
0
24 Mar 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
62
3
0
24 Mar 2025
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models
Weichen Fan
Amber Yijia Zheng
Raymond A. Yeh
Ziwei Liu
55
1
0
24 Mar 2025
Can Text-to-Video Generation help Video-Language Alignment?
Luca Zanella
Massimiliano Mancini
Willi Menapace
Sergey Tulyakov
Yiming Wang
Elisa Ricci
DiffM
VGen
57
0
0
24 Mar 2025
Aether: Geometric-Aware Unified World Modeling
Aether Team
Haoyi Zhu
Yanjie Wang
Jianjun Zhou
Wenzheng Chang
...
Zizun Li
Junyi Chen
Chunhua Shen
Jiangmiao Pang
Tong He
DiffM
VGen
62
2
0
24 Mar 2025
Training-free Diffusion Acceleration with Bottleneck Sampling
Ye Tian
Xin Xia
Yuxi Ren
Shanchuan Lin
Xing Wang
Xuefeng Xiao
Yunhai Tong
L. Yang
Bin Cui
60
0
0
24 Mar 2025
ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation
Guosheng Zhao
Xiaofeng Wang
Chaojun Ni
Zheng Zhu
Wenkang Qin
Guan Huang
Xingang Wang
73
1
0
24 Mar 2025
LongDiff: Training-Free Long Video Generation in One Go
Zhuoling Li
Hossein Rahmani
Qiuhong Ke
Xiaozhong Liu
DiffM
VGen
VLM
61
0
0
23 Mar 2025
InstructVEdit: A Holistic Approach for Instructional Video Editing
Chi Zhang
C. Feng
Feng Yan
Qiming Zhang
Mingjin Zhang
Yujie Zhong
Jing Zhang
Lin Ma
DiffM
VGen
44
0
0
22 Mar 2025
Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer
Qingyu Shi
Jianzong Wu
Jinbin Bai
Jun Zhang
Lu Qi
Xiaomeng Li
Yunhai Tong
48
0
0
21 Mar 2025
Position: Interactive Generative Video as Next-Generation Game Engine
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xintao Wang
Pengfei Wan
Di Zhang
Xihui Liu
VGen
45
1
0
21 Mar 2025
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
Kaisi Guan
Zhengfeng Lai
Yizhou Sun
Peng Zhang
Wei Liu
Kieran Liu
Meng Cao
Ruihua Song
VGen
56
0
0
21 Mar 2025
Enabling Versatile Controls for Video Diffusion Models
Xu Zhang
Hao Zhou
Haoming Qin
Xiaobin Lu
Jiaxing Yan
Guanzhong Wang
Zeyu Chen
Yi Liu
DiffM
VGen
65
0
0
21 Mar 2025
ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos
Haolin Yang
Feilong Tang
Ming Hu
Yulong Li
Junjie Guo
Yexin Liu
Zelin Peng
Junjun He
Zongyuan Ge
VGen
DiffM
98
1
0
20 Mar 2025
Animating the Uncaptured: Humanoid Mesh Animation with Video Diffusion Models
Marc Benedí San Millán
Angela Dai
Matthias Nießner
DiffM
69
0
0
20 Mar 2025
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Quanhao Li
Zhen Xing
Rui Wang
Hui Zhang
Qi Dai
Zuxuan Wu
VGen
66
0
0
20 Mar 2025
VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling
Hyojun Go
Byeongjun Park
Hyelin Nam
Byung-Hoon Kim
Hyungjin Chung
Changick Kim
3DGS
VGen
99
1
0
20 Mar 2025
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
Qihui Zhang
Munan Ning
Zheyuan Liu
Yanbo Wang
Jiayi Ye
Yue Huang
Shuo Yang
Xiao Chen
Y. Song
Li Yuan
LRM
58
0
0
19 Mar 2025
Temporal Regularization Makes Your Video Generator Stronger
Harold Haodong Chen
Haojian Huang
Xianfeng Wu
Yexin Liu
Yajing Bai
Wen-Jie Shu
Harry Yang
Ser-Nam Lim
VGen
84
2
0
19 Mar 2025
Visual Persona: Foundation Model for Full-Body Human Customization
Jisu Nam
Soowon Son
Zhan Xu
Jing Shi
Difan Liu
Feng Liu
Aashish Misraa
Seungryong Kim
Yang Zhou
DiffM
51
0
0
19 Mar 2025
Forensics-Bench: A Comprehensive Forgery Detection Benchmark Suite for Large Vision Language Models
Jin Wang
Chenghui Lv
Xian Li
Shichao Dong
Huadong Li
Kelu Yao
Chao Li
Wenqi Shao
Ping Luo
62
1
0
19 Mar 2025
Concat-ID: Towards Universal Identity-Preserving Video Synthesis
Yong Zhong
Zhuoyi Yang
Jiayan Teng
Xiaotao Gu
Chongxuan Li
VGen
63
0
0
18 Mar 2025
MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation
Hongyu Zhang
Yufan Deng
Shenghai Yuan
Peng Jin
Zesen Cheng
Yian Zhao
Chang-Shu Liu
Jie Chen
DiffM
VGen
94
0
0
18 Mar 2025
Fast Autoregressive Video Generation with Diagonal Decoding
Yang Ye
Junliang Guo
Haoyu Wu
Tianyu He
Tim Pearce
Tabish Rashid
Katja Hofmann
Jiang Bian
DiffM
VGen
81
1
0
18 Mar 2025
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
68
25
0
18 Mar 2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
Dewei Zhou
Mingwei Li
Zongxin Yang
Yi Yang
94
0
0
17 Mar 2025
MagicDistillation: Weak-to-Strong Video Distillation for Large-Scale Few-Step Synthesis
Shitong Shao
Hongwei Yi
Hanzhong Guo
Tian Ye
Daquan Zhou
Michael Lingelbach
Zhiqiang Xu
Zeke Xie
VGen
52
0
0
17 Mar 2025
TACO: Taming Diffusion for in-the-wild Video Amodal Completion
Ruijie Lu
Yixin Chen
Yu Liu
Jiaxiang Tang
Junfeng Ni
Diwen Wan
Gang Zeng
Siyuan Huang
DiffM
VGen
51
3
0
15 Mar 2025
SteerX: Creating Any Camera-Free 3D and 4D Scenes with Geometric Steering
Byeongjun Park
Hyojun Go
Hyelin Nam
Byung-Hoon Kim
Hyungjin Chung
Changick Kim
VGen
LLMSV
49
1
0
15 Mar 2025
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
Jianhong Bai
Menghan Xia
Xiao Fu
Xintao Wang
Lianrui Mu
...
Zuozhu Liu
Haoji Hu
Xiang Bai
Pengfei Wan
Di Zhang
DiffM
VGen
50
3
0
14 Mar 2025
Cross-Modal Learning for Music-to-Music-Video Description Generation
Zhuoyuan Mao
Mengjie Zhao
Qiyu Wu
Zhi-Wei Zhong
Wei-Hsiang Liao
Hiromi Wakaki
Yuki Mitsufuji
DiffM
VGen
82
0
0
14 Mar 2025
HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models
Ziqin Zhou
Yifan Yang
Yuqing Yang
Tianyu He
Houwen Peng
Kai Qiu
Qi Dai
Lili Qiu
Chong Luo
Lingqiao Liu
DiffM
VGen
60
1
0
14 Mar 2025
Learning Few-Step Diffusion Models by Trajectory Distribution Matching
Yihong Luo
Tianyang Hu
Jiacheng Sun
Yujun Cai
Jing Tang
DiffM
88
1
0
13 Mar 2025
V2Edit: Versatile Video Diffusion Editor for Videos and 3D Scenes
Yanming Zhang
Jun-Kun Chen
Jipeng Lyu
Yu-Xiong Wang
DiffM
VGen
53
0
0
13 Mar 2025
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv
Chenyang Si
Junhao Song
Zhenyu Yang
Ping Luo
Ziwei Liu
Kwan-Yee K. Wong
VGen
DiffM
84
8
0
13 Mar 2025
Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers
Yasheng Sun
Zhiliang Xu
Hang Zhou
Jiazhi Guan
Quanwei Yang
...
Yingying Li
Haocheng Feng
Jiadong Wang
Ziwei Liu
Koike Hideki
VGen
61
0
0
13 Mar 2025
RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models
Yijing Lin
Mengqi Huang
Shuhan Zhuang
Zhendong Mao
VGen
51
0
0
13 Mar 2025
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
Yufan Deng
Xun Guo
Yanjie Wang
Jacob Zhiyuan Fang
Angtian Wang
Shenghai Yuan
Yiding Yang
Bo Liu
Haibin Huang
Chongyang Ma
DiffM
VGen
72
0
0
13 Mar 2025
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
Hao He
Ceyuan Yang
Shanchuan Lin
Yinghao Xu
Meng Wei
Liangke Gui
Qi Zhao
Gordon Wetzstein
Lu Jiang
Hongsheng Li
DiffM
VGen
105
5
0
13 Mar 2025
VideoMerge: Towards Training-free Long Video Generation
Siyang Zhang
Harry Yang
Ser-Nam Lim
DiffM
VGen
50
0
0
13 Mar 2025
UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?
Yuanxin Liu
Rui Zhu
Shuhuai Ren
Jiacong Wang
Haoyuan Guo
Xu Sun
Lu Jiang
142
1
0
13 Mar 2025
Long Context Tuning for Video Generation
Yuwei Guo
Ceyuan Yang
Ziyan Yang
Zhibei Ma
Zhijie Lin
Zhenheng Yang
Dahua Lin
Lu Jiang
DiffM
VGen
76
3
0
13 Mar 2025
Accelerating Diffusion Sampling via Exploiting Local Transition Coherence
Shangwen Zhu
Han Zhang
Zhantao Yang
Qianyu Peng
Zhao Pu
Haoran Wang
Fan Cheng
DiffM
48
0
0
12 Mar 2025
WonderVerse: Extendable 3D Scene Generation with Video Generative Models
Hao Feng
Zhi Zuo
Jia-Hui Pan
Ka-Hei Hui
Yihua Shao
Qi Dou
Wei Xie
Zhengzhe Liu
VGen
55
1
0
12 Mar 2025
PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop
Chenyu Li
Oscar Michel
Xichen Pan
Sainan Liu
Mike Roberts
Saining Xie
VGen
55
3
0
12 Mar 2025
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation
Hyeonho Jeong
Suhyeon Lee
Jong Chul Ye
VGen
175
0
0
12 Mar 2025
Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework
Jing Wang
Fengzhuo Zhang
Xiaoli Li
Vincent Y. F. Tan
Tianyu Pang
Chao Du
Aixin Sun
Zhuoran Yang
VGen
67
1
0
12 Mar 2025
Unified Dense Prediction of Video Diffusion
Lehan Yang
Lu Qi
Xianrui Li
Sheng Li
Varun Jampani
Ming Yang
MDE
VOS
VGen
63
0
0
12 Mar 2025
Previous
1
2
3
4
5
6
7
Next