2408.06072
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
12 August 2024
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
Jiazheng Xu
Yuanming Yang
Wenyi Hong
Xiaohan Zhang
Guanyu Feng
Da Yin
Yuxuan Zhang
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
DiffM
VGen
Papers citing
"CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer"
50 / 309 papers shown
Title
Aligning Anime Video Generation with Human Feedback
Bingwen Zhu
Yudong Jiang
Baohan Xu
Siqian Yang
Mingyu Yin
Yidi Wu
Huyang Sun
Zuxuan Wu
EGVM
VGen
52
0
0
14 Apr 2025
D²iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia
Mengqi Huang
Nan Chen
Lei Zhang
Zhendong Mao
29
0
0
13 Apr 2025
Discriminator-Free Direct Preference Optimization for Video Diffusion
Haoran Cheng
Qide Dong
Liang Peng
Zhizhou Sha
Weiguo Feng
Jinghui Xie
Zhao Song
Shilei Wen
Xiaofei He
Boxi Wu
VGen
125
0
0
11 Apr 2025
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization
Jialu Li
Shoubin Yu
Han Lin
Jaemin Cho
Jaehong Yoon
Joey Tianyi Zhou
DiffM
VGen
52
0
0
11 Apr 2025
TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation
Ruineng Li
Daitao Xing
Huiming Sun
Yuanzhou Ha
Jinglin Shen
C. Ho
DiffM
VGen
44
0
0
11 Apr 2025
Generative AI for Film Creation: A Survey of Recent Advances
Ruihan Zhang
Borou Yu
Jiajian Min
Yetong Xin
Zheng Wei
...
Sijia Jiang
Peiwen Huang
Na Chen
Xuanxuan Liu
Anyi Rao
VGen
67
0
0
11 Apr 2025
RealCam-Vid: High-resolution Video Dataset with Dynamic Scenes and Metric-scale Camera Movements
Guangcong Zheng
Teng Li
Xianpan Zhou
Xi Li
VGen
3DV
69
1
0
11 Apr 2025
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft
Junliang Guo
Yang Ye
Tianyu He
Haoyu Wu
Yushu Jiang
Tim Pearce
Jiang Bian
VGen
SyDa
56
2
0
11 Apr 2025
PixelFlow: Pixel-Space Generative Models with Flow
Shoufa Chen
Chongjian Ge
Shilong Zhang
Peize Sun
Ping Luo
VLM
DRL
37
0
0
10 Apr 2025
Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos
Rundong Luo
Matthew Wallingford
Ali Farhadi
Noah Snavely
Wei-Chiu Ma
VGen
31
0
0
10 Apr 2025
SIGMAN: Scaling 3D Human Gaussian Generation with Millions of Assets
Yuhang Yang
Fengqi Liu
Yixing Lu
Qin Zhao
Pingyu Wu
...
Ran Yi
Yang Cao
Lizhuang Ma
Zheng-jun Zha
Junting Dong
3DGS
47
0
0
09 Apr 2025
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution
Gene Chou
Wenqi Xian
Guandao Yang
Mohamed Abdelfattah
Bharath Hariharan
Noah Snavely
Ning Yu
P. Debevec
MDE
34
0
0
09 Apr 2025
IGG: Image Generation Informed by Geodesic Dynamics in Deformation Spaces
Nian Wu
Nivetha Jayakumar
Jiarui Xing
Miaomiao Zhang
26
0
0
09 Apr 2025
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
Ruotian Peng
Haiying He
Yake Wei
Yandong Wen
D. Hu
VLM
39
0
0
09 Apr 2025
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis
Tri Ton
Ji Woo Hong
Chang D. Yoo
VGen
24
0
0
08 Apr 2025
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Mengchao Wang
Qiang Wang
Fan Jiang
Yaqi Fan
Yunpeng Zhang
Yonggang Qi
Kun Zhao
Mu Xu
DiffM
VGen
36
0
0
07 Apr 2025
One-Minute Video Generation with Test-Time Training
Karan Dalal
Daniel Koceja
Gashon Hussein
Jiarui Xu
Yue Zhao
...
Tatsunori Hashimoto
Sanmi Koyejo
Yejin Choi
Yu Sun
Xiaolong Wang
ViT
91
4
0
07 Apr 2025
Video-Bench: Human-Aligned Video Generation Benchmark
Hui Han
Siyuan Li
Jiaqi Chen
Yiwen Yuan
Yuling Wu
...
Y. Li
Jianwei Zhang
Chi Zhang
Li Li
Yongxin Ni
EGVM
VGen
73
0
0
07 Apr 2025
Multi-identity Human Image Animation with Structural Video Diffusion
Zhenzhi Wang
Yongqian Li
Yanhong Zeng
Yuwei Guo
Dahua Lin
Tianfan Xue
Bo Dai
VGen
24
0
0
05 Apr 2025
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Xuyang Guo
Zekai Huang
Jiayan Huo
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
ALM
VGen
96
2
0
05 Apr 2025
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
Wulin Xie
Yuyao Zhang
Chaoyou Fu
Yang Shi
Bingyan Nie
Hongkai Chen
Z. Zhang
Liang Wang
Tieniu Tan
36
1
0
04 Apr 2025
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration
Boyuan Wang
Runqi Ouyang
Xiaofeng Wang
Zheng Zhu
Guosheng Zhao
Chaojun Ni
Guan Huang
Lihong Liu
Xingang Wang
3DGS
74
0
0
04 Apr 2025
Classic Video Denoising in a Machine Learning World: Robust, Fast, and Controllable
Xin Jin
Simon Niklaus
Zhoutong Zhang
Zhihao Xia
Chunle Guo
Yuting Yang
J. Chen
Chongyi Li
VGen
49
0
0
04 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
D. Li
Di Qiu
Jiadong Wang
Yikun Dou
...
J. Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
71
2
0
03 Apr 2025
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model
Shengjun Zhang
Jinzhao Li
Xin Fei
Hao Liu
Yueqi Duan
DiffM
3DGS
VGen
73
0
0
03 Apr 2025
Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments
Chenyu Zhang
Daniil Cherniavskii
Andrii Zadaianchuk
Antonios Tragoudaras
Antonios Vozikis
Thijmen Nijdam
Derck W. E. Prinzhorn
Mark Bodracska
N. Sebe
E. Gavves
EGVM
VGen
54
0
0
03 Apr 2025
FlowR: Flowing from Sparse to Dense 3D Reconstructions
Tobias Fischer
Samuel Rota Buló
Yung-Hsu Yang
Nikhil Varma Keetha
Lorenzo Porzi
Norman Muller
Katja Schwarz
Jonathon Luiten
Marc Pollefeys
Peter Kontschieder
3DGS
50
0
0
02 Apr 2025
Articulated Kinematics Distillation from Video Diffusion Models
Xuan Li
Qianli Ma
Nayeon Lee
Yongxin Chen
Chenfanfu Jiang
Xuan Li
Donglai Xiang
VGen
38
0
0
01 Apr 2025
WorldScore: A Unified Evaluation Benchmark for World Generation
Haoyi Duan
Hong-Xing Yu
Sirui Chen
L. Fei-Fei
Jiajun Wu
VGen
65
3
0
01 Apr 2025
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
Junhao Cheng
Yuying Ge
Yixiao Ge
Jing Liao
Ying Shan
VGen
AI4CE
58
0
0
01 Apr 2025
HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation
Kun Liu
Qi Liu
Xinchen Liu
Jie Li
Yongdong Zhang
Jiebo Luo
Xiaodong He
Wu Liu
VGen
37
0
0
31 Mar 2025
HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation
Boyuan Wang
Xiaofeng Wang
Chaojun Ni
Guosheng Zhao
Zhiqin Yang
...
Yukun Zhou
Xinze Chen
Guan Huang
Lihong Liu
Xingang Wang
VGen
59
2
0
31 Mar 2025
Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation
Abhiram Maddukuri
Z. L. Jiang
L. Chen
Soroush Nasiriany
Yuqi Xie
...
Scott Reed
Ken Goldberg
Ajay Mandlekar
Linxi Fan
Yuke Zhu
59
6
0
31 Mar 2025
VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior
Xindi Yang
Baolu Li
Wenjie Qu
Zhenfei Yin
Lei Bai
...
Zhiyong Wang
Jianfei Cai
Tien-Tsin Wong
Huchuan Lu
Xu Jia
DiffM
VGen
51
0
0
30 Mar 2025
VideoGen-Eval: Agent-based System for Video Generation Evaluation
Yuhang Yang
Ke Fan
Siyang Song
Hongxiang Li
Ailing Zeng
FeiLin Han
Wei-dong Zhai
Wei Liu
Yang Cao
Zheng-jun Zha
EGVM
VGen
78
0
0
30 Mar 2025
SketchVideo: Sketch-based Video Generation and Editing
Feng-Lin Liu
Hongbo Fu
Xintao Wang
Weicai Ye
Pengfei Wan
Di Zhang
Lin Gao
DiffM
VGen
45
0
0
30 Mar 2025
MoCha: Towards Movie-Grade Talking Character Synthesis
Cong Wei
Bo Sun
Haoyu Ma
Ji Hou
F. Xu
...
Kunpeng Li
Tingbo Hou
Animesh Sinha
Peter Vajda
Wenhu Chen
VGen
140
0
0
30 Mar 2025
DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers
H. Zhang
R. Su
Zhihang Yuan
Pengtao Chen
Mingzhu Shen
Yibo Fan
Shengen Yan
Guohao Dai
Yu Wang
39
0
0
28 Mar 2025
CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving
Yishen Ji
Ziyue Zhu
Zhenxin Zhu
Kaixin Xiong
Ming Lu
Zhiqi Li
Lijun Zhou
Haiyang Sun
Bing Wang
Tong Lu
VGen
55
1
0
28 Mar 2025
Exploring the Evolution of Physics Cognition in Video Generation: A Survey
Minghui Lin
Xiang Wang
Yue Wang
Shu Wang
Fengqi Dai
...
Cunxiang Wang
Zhengrong Zuo
Nong Sang
Siteng Huang
Donglin Wang
EGVM
VGen
87
3
0
27 Mar 2025
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness
Dian Zheng
Ziqi Huang
Hongbo Liu
Kai Zou
Yinan He
...
Yuyao Zhang
Jingwen He
Wei-Shi Zheng
Yu Qiao
Ziwei Liu
EGVM
VGen
56
6
0
27 Mar 2025
DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation
Haoyu Zhao
Zhongang Qi
Cong Wang
Qingping Zheng
Guansong Lu
Fei Chen
Hang Xu
Zuxuan Wu
DiffM
VGen
48
0
0
27 Mar 2025
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Jaywon Koo
J. Hernandez
Moayed Haji-Ali
Ziyan Yang
Vicente Ordonez
EGVM
72
0
0
27 Mar 2025
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization
Jiale Cheng
Ruiliang Lyu
Xiaotao Gu
Xiao-Chang Liu
Jiazheng Xu
...
Zhuoyi Yang
Yuxiao Dong
Jie Tang
David W. Romero
Minlie Huang
VGen
89
0
0
26 Mar 2025
Synthetic Video Enhances Physical Fidelity in Video Synthesis
Qi Zhao
Xingyu Ni
Ziyu Wang
Feng Cheng
Ziyan Yang
Lu Jiang
Bohan Wang
VGen
47
2
0
26 Mar 2025
Video Motion Graphs
Haiyang Liu
Zhan Xu
Fa-Ting Hong
Hsin-Ping Huang
Yi Zhou
Yang Zhou
DiffM
VGen
90
0
0
26 Mar 2025
Mask²DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation
Tianhao Qi
Jianlong Yuan
Wanquan Feng
Shancheng Fang
Jiawei Liu
Siyu Zhou
Qian He
Hongtao Xie
Yongdong Zhang
DiffM
VGen
44
0
0
25 Mar 2025
AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset
Haiyu Zhang
Xinyuan Chen
Yaohui Wang
Xihui Liu
Yunhong Wang
Yu Qiao
VGen
67
0
0
25 Mar 2025
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
Jiazhi Guan
Kaisiyuan Wang
Zhiliang Xu
Quanwei Yang
Yasheng Sun
...
Errui Ding
Jiadong Wang
Youjian Zhao
Hang Zhou
Ziwei Liu
VGen
44
0
0
25 Mar 2025
Multi-Object Sketch Animation by Scene Decomposition and Motion Planning
Jingyu Liu
Zijie Xin
Yuhan Fu
Ruixiang Zhao
Bangxiang Lan
Xirong Li
39
0
0
25 Mar 2025