Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2205.15868
Cited By
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
29 May 2022
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"
50 / 458 papers shown
Title
RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation
Aviv Slobodkin
Hagai Taitelbaum
Yonatan Bitton
Brian Gordon
Michal Sokolik
Nitzan Bitton-Guetta
Almog Gueta
Royi Rassin
Itay Laish
Dani Lischinski
EGVM
VGen
110
0
0
24 Apr 2025
Symbolic Representation for Any-to-Any Generative Tasks
Jianfei Chen
Xiaoye Zhu
Yanjie Wang
Tianyang Liu
Xinhui Chen
...
Yifei Ke
Qingbin Liu
Yiwen Yuan
Julian McAuley
Li Li
DiffM
78
0
0
24 Apr 2025
Latent Diffusion Planning for Imitation Learning
Amber Xie
Oleh Rybkin
Dorsa Sadigh
Chelsea Finn
106
1
0
23 Apr 2025
Subject-driven Video Generation via Disentangled Identity and Motion
Daneul Kim
Jingxu Zhang
W. Jin
Sunghyun Cho
Qi Dai
Jaesik Park
Chong Luo
DiffM
VGen
168
2
0
23 Apr 2025
ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance
Ying Li
Xiaobao Wei
Xiaowei Chi
Yiming Li
Zhongyu Zhao
Hao Wang
Ningning MA
Ming Lu
Shanghang Zhang
VGen
85
0
0
23 Apr 2025
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
Xuzhao Li
Chenming Wu
Zhao Yang
Zhihao Xu
Dingkang Liang
Yanzhe Zhang
Ji Wan
Jiadong Wang
VGen
128
2
0
22 Apr 2025
Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis
Jingjing Ren
Wenbo Li
Zhongdao Wang
Haoze Sun
Bangzhen Liu
...
Aoxue Li
Shifeng Zhang
Bin Shao
Yong Guo
Lei Zhu
VGen
108
0
0
20 Apr 2025
Understanding Attention Mechanism in Video Diffusion Models
Bingyan Liu
Chengyu Wang
Tongtong Su
Huan Ten
Jun Huang
K. Guo
Kui Jia
VGen
105
1
0
16 Apr 2025
Taming Consistency Distillation for Accelerated Human Image Animation
Xinyu Wang
Shiwei Zhang
Hangjie Yuan
Yujie Wei
Yuanxing Zhang
Changxin Gao
Yuehuan Wang
Nong Sang
VGen
83
1
0
15 Apr 2025
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting
Jiaxin Huang
Sheng Miao
BangBnag Yang
Yuewen Ma
Yiyi Liao
VGen
MDE
159
0
0
15 Apr 2025
OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding
Dianbing Xi
Jiadong Wang
Yuanzhi Liang
Xi Qiu
Yuchi Huo
Ruiqi Wang
Fangqiu Yi
Xuzhao Li
DiffM
VGen
122
1
0
15 Apr 2025
VideoPanda: Video Panoramic Diffusion with Multi-view Attention
Kevin Xie
Amirmojtaba Sabour
Jiahui Huang
Despoina Paschalidou
G. Klár
Umar Iqbal
Sanja Fidler
Fangyin Wei
VGen
MDE
125
1
0
15 Apr 2025
H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
Yushu Wu
Yanyu Li
Ivan Skorokhodov
Anil Kag
Willi Menapace
Sharath Girish
Aliaksandr Siarohin
Yanzhi Wang
Sergey Tulyakov
DiffM
VGen
99
0
0
14 Apr 2025
KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
Xingrui Wang
Jiang-Long Liu
Ziyi Wang
Xiaodong Yu
Jialian Wu
Xingwu Sun
Yusheng Su
Alan Yuille
Zicheng Liu
Emad Barsoum
DiffM
VGen
72
0
0
13 Apr 2025
PixelFlow: Pixel-Space Generative Models with Flow
Shoufa Chen
Chongjian Ge
Shilong Zhang
Peize Sun
Ping Luo
VLM
DRL
74
0
0
10 Apr 2025
Cellular Development Follows the Path of Minimum Action
Rohola Zandie
Farhan Khodaee
Yufan Xia
Elazer R. Edelman
93
0
0
10 Apr 2025
One-Minute Video Generation with Test-Time Training
Karan Dalal
Daniel Koceja
Gashon Hussein
Jiarui Xu
Yue Zhao
...
Tatsunori Hashimoto
Sanmi Koyejo
Yejin Choi
Yu Sun
Xiaolong Wang
ViT
184
13
0
07 Apr 2025
Video-Bench: Human-Aligned Video Generation Benchmark
Hui Han
Siyuan Li
Jiaqi Chen
Yiwen Yuan
Yuling Wu
...
You Li
Jing Zhang
Chi Zhang
Li Li
Yongxin Ni
EGVM
VGen
214
0
0
07 Apr 2025
Gaussian Mixture Flow Matching Models
Hansheng Chen
Kai Zhang
Hao Tan
Zexiang Xu
Fujun Luan
Leonidas Guibas
Gordon Wetzstein
Sai Bi
DiffM
147
2
0
07 Apr 2025
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Xuyang Guo
Zekai Huang
Jiayan Huo
Yingyu Liang
Zhenmei Shi
Zhao Song
Jiahao Zhang
ALM
VGen
208
6
0
05 Apr 2025
Classic Video Denoising in a Machine Learning World: Robust, Fast, and Controllable
Xin Jin
Simon Niklaus
Zhoutong Zhang
Zhihao Xia
Chunle Guo
Yuting Yang
J. Chen
Chongyi Li
VGen
137
0
0
04 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
Didong Li
Di Qiu
Jiadong Wang
Yikun Dou
...
Jinfeng Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
135
10
0
03 Apr 2025
Beyond Static Scenes: Camera-controllable Background Generation for Human Motion
Mingshuai Yao
Mengting Chen
Qinye Zhou
Yize Zhang
Ming-Yu Liu
...
Chen Ju
Shuai Xiao
Qingwen Liu
Jinsong Lan
Wangmeng Zuo
DiffM
VGen
120
1
0
01 Apr 2025
WorldScore: A Unified Evaluation Benchmark for World Generation
Haoyi Duan
Hong-Xing Yu
Sirui Chen
L. Fei-Fei
Jiajun Wu
VGen
142
8
0
01 Apr 2025
JointTuner: Appearance-Motion Adaptive Joint Training for Customized Video Generation
Fangda Chen
Shanshan Zhao
Chuanfu Xu
Long Lan
VGen
91
2
0
31 Mar 2025
HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation
Boyuan Wang
Xiaofeng Wang
Chaojun Ni
Guosheng Zhao
Zhiqin Yang
...
Yukun Zhou
Xinze Chen
Guan Huang
Lihong Liu
Xingang Wang
VGen
111
3
0
31 Mar 2025
SketchVideo: Sketch-based Video Generation and Editing
Feng-Lin Liu
Hongbo Fu
Xintao Wang
Weicai Ye
Pengfei Wan
Di Zhang
Lin Gao
DiffM
VGen
138
0
0
30 Mar 2025
Zero4D: Training-Free 4D Video Generation From Single Video Using Off-the-Shelf Video Diffusion
Jangho Park
Taesung Kwon
Jong Chul Ye
VGen
105
1
0
28 Mar 2025
VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models
Chi-Pin Huang
Yen-Siang Wu
Hung-Kai Chung
Kai-Po Chang
Fu-En Yang
Yu-Jie Wang
DiffM
VGen
102
1
0
27 Mar 2025
Controlling Large Language Model with Latent Actions
Chengxing Jia
Ziniu Li
Pengyuan Wang
Yi-Chen Li
Zhenyu Hou
Yuxiao Dong
Y. Yu
119
1
0
27 Mar 2025
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness
Dian Zheng
Ziqi Huang
Hongbo Liu
Kai Zou
Yinan He
...
Yize Zhang
Jingwen He
Wei-Shi Zheng
Yu Qiao
Ziwei Liu
EGVM
VGen
130
14
0
27 Mar 2025
Video Motion Graphs
Haiyang Liu
Zhan Xu
Fa-Ting Hong
Hsin-Ping Huang
Yi Zhou
Yang Zhou
DiffM
VGen
157
1
0
26 Mar 2025
Synthetic Video Enhances Physical Fidelity in Video Synthesis
Qi Zhao
Xingyu Ni
Ziyu Wang
Feng Cheng
Ziyan Yang
Lu Jiang
Bohan Wang
VGen
93
3
0
26 Mar 2025
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency
T. Liu
Z. Huang
Zhaoxi Chen
Guangcong Wang
Shoukang Hu
Liao Shen
Huiqiang Sun
Z. Cao
Wei Li
Ziwei Liu
VGen
3DGS
129
1
0
26 Mar 2025
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Yuchao Gu
Weijia Mao
Mike Zheng Shou
VGen
176
11
0
25 Mar 2025
ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation
Guosheng Zhao
Xiaofeng Wang
Chaojun Ni
Zheng Zhu
Wenkang Qin
Guan Huang
Xingang Wang
127
2
0
24 Mar 2025
Resource-Efficient Motion Control for Video Generation via Dynamic Mask Guidance
Sicong Feng
Jielong Yang
Li Peng
DiffM
VGen
68
0
0
24 Mar 2025
Video-T1: Test-Time Scaling for Video Generation
Fan Liu
Hanyang Wang
Yimo Cai
Kaiyan Zhang
Xiaohang Zhan
Yueqi Duan
DiffM
VGen
160
7
0
24 Mar 2025
AMD-Hummingbird: Towards an Efficient Text-to-Video Model
Takashi Isobe
He Cui
Dong Zhou
Mengmeng Ge
D. Li
E. Barsoum
VGen
93
1
0
24 Mar 2025
Aether: Geometric-Aware Unified World Modeling
Aether Team
Haoyi Zhu
Yanjie Wang
Jianjun Zhou
Wenzheng Chang
...
Zizun Li
Junyi Chen
Chunhua Shen
Jiangmiao Pang
Tong He
DiffM
VGen
124
9
0
24 Mar 2025
TransAnimate: Taming Layer Diffusion to Generate RGBA Video
Xuewei Chen
Zhimin Chen
Yiren Song
VGen
126
2
0
23 Mar 2025
ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation
Oucheng Huang
Yuhang Ma
Zeng Zhao
Mingrui Wu
Jiayi Ji
Rongsheng Zhang
Zhibo Hu
Xiaoshuai Sun
Rongrong Ji
78
1
0
22 Mar 2025
RDTF: Resource-efficient Dual-mask Training Framework for Multi-frame Animated Sticker Generation
Zhiqiang Yuan
Ting Zhang
Ying Deng
Jiapei Zhang
Yeshuang Zhu
Zexi Jia
Jie Zhou
Jinchao Zhang
VGen
72
0
0
22 Mar 2025
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
Kaisi Guan
Zhengfeng Lai
Yizhou Sun
Peng Zhang
Wei Liu
Kieran Liu
Meng Cao
Ruihua Song
VGen
93
0
0
21 Mar 2025
Enabling Versatile Controls for Video Diffusion Models
Xu Zhang
Hao Zhou
Haoming Qin
Xiaobin Lu
Jiaxing Yan
Guanzhong Wang
Zeyu Chen
Yi Liu
DiffM
VGen
96
1
0
21 Mar 2025
MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving
Haiguang Wang
Daqi Liu
Hongwei Xie
Haisong Liu
Enhui Ma
Kaicheng Yu
Limin Wang
Bing Wang
VGen
125
2
0
20 Mar 2025
PoseTraj: Pose-Aware Trajectory Control in Video Diffusion
Longbin Ji
Lei Zhong
Pengfei Wei
Changjian Li
DiffM
VGen
87
0
0
20 Mar 2025
VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling
Hyojun Go
Byeongjun Park
Hyelin Nam
Byung-Hoon Kim
Hyungjin Chung
Changick Kim
3DGS
VGen
158
1
0
20 Mar 2025
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Quanhao Li
Zhen Xing
Rui Wang
Hui Zhang
Qi Dai
Zuxuan Wu
VGen
118
2
0
20 Mar 2025
Temporal Regularization Makes Your Video Generator Stronger
Harold Haodong Chen
Haojian Huang
Xianfeng Wu
Yexin Liu
Yajing Bai
Wen-Jie Shu
Harry Yang
Ser-Nam Lim
VGen
131
3
0
19 Mar 2025
Previous
1
2
3
4
5
...
8
9
10
Next