Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2408.06072
Cited By
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
12 August 2024
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
Jiazheng Xu
Yuanming Yang
Wenyi Hong
Xiaohan Zhang
Guanyu Feng
Da Yin
Yuxuan Zhang
Weihan Wang
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
DiffM
VGen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer"
50 / 309 papers shown
Title
DreamGen: Unlocking Generalization in Robot Learning through Neural Trajectories
Joel Jang
Seonghyeon Ye
Zongyu Lin
Jiannan Xiang
Johan Bjorck
...
Dieter Fox
Jan Kautz
Scott Reed
Yuke Zhu
Linxi Fan
VGen
OffRL
AI4TS
4
0
0
19 May 2025
BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation
Haiquan Wen
Yiwei He
Zhenglin Huang
Tianxiao Li
Zihan YU
Xingru Huang
Lu Qi
Baoyuan Wu
Xuelong Li
Guangliang Cheng
VGen
9
0
0
19 May 2025
Video-GPT via Next Clip Diffusion
Shaobin Zhuang
Zhipeng Huang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Binxin Yang
Chong Sun
Chen Li
Yali Wang
DiffM
VGen
4
0
0
18 May 2025
VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption
Tianxiong Zhong
Xingye Tian
Boyuan Jiang
Xuebo Wang
Xin Tao
Pengfei Wan
Zhiwei Zhang
2
0
0
17 May 2025
LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation
Jiarui Wang
Huiyu Duan
Ziheng Jia
Yu Zhao
Woo Yi Yang
...
Z. Chen
Juntong Wang
Yuke Xing
Guangtao Zhai
Xiongkuo Min
VGen
2
0
0
17 May 2025
Face Consistency Benchmark for GenAI Video
Michal Podstawski
Malgorzata Kudelska
Haohong Wang
CVBM
EGVM
28
0
0
16 May 2025
SpikeVideoFormer: An Efficient Spike-Driven Video Transformer with Hamming Attention and
O
(
T
)
\mathcal{O}(T)
O
(
T
)
Complexity
Shihao Zou
Qingfeng Li
Wei Ji
Jingjing Li
Yongkui Yang
Guoqi Li
Chao Dong
27
0
0
15 May 2025
MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation
Yanbo Ding
Xirui Hu
Zhizhi Guo
Yansen Wang
DiffM
VGen
36
0
0
15 May 2025
Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model
Wei Li
Ming Hu
Guoan Wang
Lihao Liu
Kaijin Zhou
Junzhi Ning
Xin Guo
Zongyuan Ge
Lixu Gu
Junjun He
28
0
0
12 May 2025
DAPE: Dual-Stage Parameter-Efficient Fine-Tuning for Consistent Video Editing with Diffusion Models
Junhao Xia
Chaoyang Zhang
Yecheng Zhang
Chengyang Zhou
Zhichang Wang
Bochun Liu
Dongshuo Yin
DiffM
VGen
31
0
0
11 May 2025
BridgeIV: Bridging Customized Image and Video Generation through Test-Time Autoregressive Identity Propagation
Panwen Hu
Jiehui Huang
Qiang Sun
Xiaodan Liang
DiffM
VGen
28
0
0
11 May 2025
Diffusion Model Quantization: A Review
Qian Zeng
Chenggong Hu
Mingli Song
Jie Song
MQ
45
0
0
08 May 2025
T2VTextBench: A Human Evaluation Benchmark for Textual Control in Video Generation Models
Xuyang Guo
Jiayan Huo
Zhenmei Shi
Zhao Song
Jiahao Zhang
Jiale Zhao
VGen
189
0
0
08 May 2025
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Teng Hu
Zhentao Yu
Zhengguang Zhou
Sen Liang
Yuan Zhou
Qin Lin
Qinglin Lu
DiffM
VGen
57
0
0
07 May 2025
FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios
Shiyi Zhang
Junhao Zhuang
Zhaoyang Zhang
Ying Shan
Yansong Tang
VGen
107
0
0
06 May 2025
Learning 3D Persistent Embodied World Models
Siyuan Zhou
Yilun Du
Yuncong Yang
Lei Han
Peihao Chen
Dit-Yan Yeung
Chuang Gan
VGen
47
0
0
05 May 2025
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
D. Jiang
Mengmeng Wang
Liuzhuozheng Li
Lei Zhang
Haoyu Wang
Wei Wei
Guang Dai
Yanning Zhang
Jingdong Wang
DiffM
51
0
0
05 May 2025
DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization
Wenchuan Wang
Mengqi Huang
Yijing Tu
Zhendong Mao
VGen
69
0
0
04 May 2025
Generating Animated Layouts as Structured Text Representations
Yeonsang Shin
Jihwan Kim
Yumin Song
Kyungseung Lee
Hyunhee Chung
Taeyoung Na
DiffM
VGen
70
0
0
02 May 2025
Controllable Weather Synthesis and Removal with Video Diffusion Models
Chih-Hao Lin
Zihan Wang
Ruofan Liang
Yuxuan Zhang
Sanja Fidler
Shenlong Wang
Zan Gojcic
DiffM
VGen
48
0
0
01 May 2025
T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation
Xuyang Guo
Jiayan Huo
Zhenmei Shi
Zhao Song
Jiahao Zhang
Jiale Zhao
EGVM
VGen
PINN
82
1
0
01 May 2025
A Survey of Interactive Generative Video
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xinyu Wang
Pengfei Wan
Di Zhang
Kun Gai
Hao Chen
Xihui Liu
VGen
65
0
0
30 Apr 2025
Direct Motion Models for Assessing Generated Videos
Kelsey R. Allen
Carl Doersch
Guangyao Zhou
Mohammed Suhail
Danny Driess
...
Thomas Kipf
Mehdi S. M. Sajjadi
Kevin P. Murphy
João Carreira
Sjoerd van Steenkiste
EGVM
DiffM
VGen
78
0
0
30 Apr 2025
AnimateAnywhere: Rouse the Background in Human Image Animation
Xiaoyu Liu
Mingshuai Yao
Y. Zhang
Xianhui Lin
Peiran Ren
X. Li
Ming-Yu Liu
W. Zuo
3DH
DiffM
65
0
0
28 Apr 2025
We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
Minkyu Choi
Sundar Sripada V. S.
Harsh Goel
Sahil Shah
Sandeep P. Chinchali
DiffM
VGen
91
0
0
24 Apr 2025
PMG: Progressive Motion Generation via Sparse Anchor Postures Curriculum Learning
Yingjie Xi
J. J. Zhang
Xiaosong Yang
46
0
0
23 Apr 2025
Subject-driven Video Generation via Disentangled Identity and Motion
Daneul Kim
Jingxu Zhang
W. Jin
Sunghyun Cho
Qi Dai
Jaesik Park
Chong Luo
DiffM
VGen
115
0
0
23 Apr 2025
BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation
Ke Xu
Mingli Zhu
Jiarong Ou
R. J. Chen
Xin Tao
Pengfei Wan
Baoyuan Wu
DiffM
AAML
VGen
53
0
0
23 Apr 2025
Reasoning Physical Video Generation with Diffusion Timestep Tokens via Reinforcement Learning
Wang Lin
Liyu Jia
Wentao Hu
Kaihang Pan
Zhongqi Yue
Wei Zhao
Jingyuan Chen
Fei Wu
Hanwang Zhang
VGen
46
1
0
22 Apr 2025
DiTPainter: Efficient Video Inpainting with Diffusion Transformers
Xian Wu
Chang Liu
DiffM
28
0
0
22 Apr 2025
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
Xuzhao Li
Chenming Wu
Zhao Yang
Zhihao Xu
Dingkang Liang
Wenjie Qu
Ji Wan
Jiadong Wang
VGen
67
1
0
22 Apr 2025
T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models
Siyuan Liang
Jiayang Liu
Jiecheng Zhai
Tianmeng Fang
Rongcheng Tu
A. Liu
Xiaochun Cao
Dacheng Tao
VGen
61
0
0
22 Apr 2025
Tiger200K: Manually Curated High Visual Quality Video Dataset from UGC Platform
Xianpan Zhou
VGen
63
0
0
21 Apr 2025
DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation
Weijie He
Mushui Liu
YunLong Yu
Zhao Wang
Chao Wu
DiffM
VGen
64
0
0
21 Apr 2025
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation
Chenjie Cao
Jingkai Zhou
Shikai Li
Jingyun Liang
Chaohui Yu
Fan Wang
Xiangyang Xue
Yanwei Fu
DiffM
VGen
68
0
0
21 Apr 2025
Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis
Jingjing Ren
Wenbo Li
Zhongdao Wang
Haoze Sun
Bangzhen Liu
...
Aoxue Li
Shifeng Zhang
Bin Shao
Yong Guo
Lei Zhu
VGen
43
0
0
20 Apr 2025
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation
Minho Park
Taewoong Kang
Jooyeol Yun
Sungwon Hwang
Jaegul Choo
VGen
MDE
29
0
0
19 Apr 2025
Physical Reservoir Computing in Hook-Shaped Rover Wheel Spokes for Real-Time Terrain Identification
Xiao Jin
Zihan Wang
Zhenhua Yu
Changrak Choi
Kalind Carpenter
T. Nanayakkara
40
0
0
17 Apr 2025
SkyReels-V2: Infinite-length Film Generative Model
Guibin Chen
D. Lin
Jiangping Yang
Chunze Lin
J. Zhu
...
Di Qiu
Debang Li
Zhengcong Fei
Yang Li
Yahui Zhou
DiffM
VGen
56
1
0
17 Apr 2025
Understanding Attention Mechanism in Video Diffusion Models
Bingyan Liu
Chengyu Wang
Tongtong Su
Huan Ten
Jun Huang
K. Guo
Kui Jia
VGen
64
0
0
16 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
76
0
0
16 Apr 2025
Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation
Linus Jern
Valter Uotila
Cong Yu
Bo Zhao
MQ
LRM
27
0
0
15 Apr 2025
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding
Dianbing Xi
Jiadong Wang
Yuanzhi Liang
Xi Qiu
Yuchi Huo
R. Wang
Chi Zhang
Xuzhao Li
DiffM
VGen
65
0
0
15 Apr 2025
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting
Jiaxin Huang
Sheng Miao
BangBnag Yang
Yuewen Ma
Yiyi Liao
VGen
MDE
33
0
0
15 Apr 2025
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL
Junke Wang
Zhi Tian
Xinyu Wang
Xinyu Zhang
Weilin Huang
Zuxuan Wu
Yu Jiang
VGen
55
6
0
15 Apr 2025
Analysis of Attention in Video Diffusion Transformers
Yuxin Wen
Jim Wu
Ajay Jain
Tom Goldstein
Ashwinee Panda
53
1
0
14 Apr 2025
Decoupled Diffusion Sparks Adaptive Scene Generation
Yunsong Zhou
Naisheng Ye
William Ljungbergh
Tianyu Li
Jiazhi Yang
Zetong Yang
Hongzi Zhu
Christoffer Petersson
Hongyang Li
44
1
0
14 Apr 2025
Aligning Anime Video Generation with Human Feedback
Bingwen Zhu
Yudong Jiang
Baohan Xu
Siqian Yang
Mingyu Yin
Yidi Wu
Huyang Sun
Zuxuan Wu
EGVM
VGen
55
0
0
14 Apr 2025
H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
Yushu Wu
Yanyu Li
Ivan Skorokhodov
Anil Kag
Willi Menapace
Sharath Girish
Aliaksandr Siarohin
Yanzhi Wang
Sergey Tulyakov
DiffM
VGen
39
0
0
14 Apr 2025
EquiVDM: Equivariant Video Diffusion Models with Temporally Consistent Noise
Chao Liu
Arash Vahdat
DiffM
VGen
44
0
0
14 Apr 2025
1
2
3
4
5
6
7
Next