Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2205.15868
Cited By
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
29 May 2022
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"
50 / 458 papers shown
Title
MuMu-LLaMA: Multi-modal Music Understanding and Generation via Large Language Models
Shansong Liu
Atin Sakkeer Hussain
Qilong Wu
Chenshuo Sun
Ying Shan
AuLLM
121
4
0
09 Dec 2024
DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models
Yizhuo Li
Yuying Ge
Yixiao Ge
Ping Luo
Ying Shan
DiffM
VGen
186
0
0
05 Dec 2024
Seeing Beyond Views: Multi-View Driving Scene Video Generation with Holistic Attention
Hannan Lu
Xiaohe Wu
Shudong Wang
Xiameng Qin
Xinyu Zhang
Junyu Han
W. Zuo
Ji Tao
143
2
0
04 Dec 2024
SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text
Haohe Liu
Gaël Le Lan
Xinhao Mei
Zhaoheng Ni
Anurag Kumar
Varun K. Nagaraja
Wenwu Wang
Mark D. Plumbley
Yangyang Shi
Vikas Chandra
VGen
159
1
0
03 Dec 2024
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses
Yatian Pang
Bin Zhu
Bin Lin
Mingzhe Zheng
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
VGen
3DH
127
7
0
30 Nov 2024
PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation
Qiyao Xue
Xiangyu Yin
Boyuan Yang
Wei Gao
DiffM
VGen
173
12
0
30 Nov 2024
Motion Dreamer: Boundary Conditional Motion Reasoning for Physically Coherent Video Generation
Tianshuo Xu
Zhifei Chen
Leyi Wu
Hao Lu
Yuying Chen
Lihui Jiang
Bingbing Liu
Yingcong Chen
VGen
126
0
0
30 Nov 2024
ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration
Chaojun Ni
Guosheng Zhao
Xiaofeng Wang
Zheng Hua Zhu
Wenkang Qin
...
Kun Zhan
Peng Jia
Xianpeng Lang
Xingang Wang
Wenjun Mei
VGen
398
11
0
29 Nov 2024
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
Florinel-Alin Croitoru
Andrei Iulian Hiji
Vlad Hondru
Nicolae-Cătălin Ristea
Paul Irofti
Marius Popescu
Cristian Rusu
Radu Tudor Ionescu
Fahad Shahbaz Khan
Mubarak Shah
135
5
0
29 Nov 2024
Diffusion Self-Distillation for Zero-Shot Customized Image Generation
Shengqu Cai
Eric Ryan Chan
Yunzhi Zhang
Leonidas Guibas
Jiajun Wu
Gordon Wetzstein
132
13
0
27 Nov 2024
Scene Co-pilot: Procedural Text to Video Generation with Human in the Loop
Zhaofang Qian
Abolfazl Sharifi
Tucker Carroll
Ser-Nam Lim
VGen
143
0
0
26 Nov 2024
StableAnimator: High-Quality Identity-Preserving Human Image Animation
Shuyuan Tu
Zhen Xing
Xintong Han
Zhi-Qi Cheng
Qi Dai
Chong Luo
Zuxuan Wu
VGen
218
23
0
26 Nov 2024
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Zhichao Zhang
Wei Sun
Xinyue Li
Yunhao Li
Qihang Ge
...
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guangtao Zhai
EGVM
250
1
0
25 Nov 2024
Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Xiaozhong Ji
Xiaobin Hu
Zhihong Xu
Junwei Zhu
Chuming Lin
...
Donghao Luo
Yi Chen
Qin Lin
Qinglin Lu
Chengjie Wang
VGen
156
11
0
25 Nov 2024
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu
Mingyu Liu
Zeyu Zhu
Xi Xia
Haoen Feng
Wen Wang
Kevin Qinghong Lin
Chunhua Shen
Mike Zheng Shou
DiffM
VGen
230
3
0
22 Nov 2024
Generating 3D-Consistent Videos from Unposed Internet Photos
Gene Chou
Kai Zhang
Sai Bi
Hao Tan
Zexiang Xu
Fujun Luan
Bharath Hariharan
Noah Snavely
3DGS
VGen
164
3
0
20 Nov 2024
OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models
Mathis Koroglu
Hugo Caselles-Dupré
Guillaume Jeanneret Sanmiguel
Matthieu Cord
VGen
DiffM
54
2
0
15 Nov 2024
Spider: Any-to-Many Multimodal LLM
Jinxiang Lai
Jie Zhang
Jun Liu
Jian Li
Xiaocheng Lu
Song Guo
MLLM
189
2
0
14 Nov 2024
Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models
Chengdong Dong
Vijayakumar Bhagavatula
Zhenyu Zhou
Ajay Kumar
130
0
0
13 Nov 2024
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation
Xiaofeng Wang
Kang Zhao
Fan Liu
Jiayu Wang
Guosheng Zhao
Xiaoyi Bao
Zheng Hua Zhu
Yingya Zhang
Xingang Wang
VGen
119
10
0
13 Nov 2024
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
183
0
0
12 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
191
14
0
08 Nov 2024
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
David Junhao Zhang
Roni Paiss
Shiran Zada
Nikhil Karnad
David E. Jacobs
Yael Pritch
Inbar Mosseri
Mike Zheng Shou
Neal Wadhwa
Nataniel Ruiz
DiffM
VGen
154
21
0
07 Nov 2024
Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey
Ao Fu
Yi Zhou
Tao Zhou
Yue Yang
Bojun Gao
Qun Li
Guobin Wu
Ling Shao
VGen
100
3
0
05 Nov 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
107
12
0
28 Oct 2024
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Zongyi Li
Shujie Hu
Shujie Liu
Long Zhou
Jeongsoo Choi
Lingwei Meng
Xun Guo
Jiajian Li
H. Ling
Furu Wei
VGen
DiffM
154
7
0
27 Oct 2024
NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction
Z. Gong
Guangyin Bao
Qi Zhang
Zhongwei Wan
Duoqian Miao
...
Changwei Wang
Rongtao Xu
Liang Hu
Ke Liu
Yu Zhang
DiffM
VGen
124
10
0
25 Oct 2024
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control
Yujie Wei
Shiwei Zhang
Hangjie Yuan
Xiang Wang
Haonan Qiu
...
Fan Liu
Zhizhong Huang
Jiaxin Ye
Yingya Zhang
Hongming Shan
DiffM
VGen
113
18
0
17 Oct 2024
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation
Guosheng Zhao
Chaojun Ni
Xiaofeng Wang
Zheng Zhu
Xinming Zhang
...
Xinze Chen
Boyuan Wang
Youyi Zhang
Wenjun Mei
Xingang Wang
VGen
174
32
0
17 Oct 2024
Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing
Mingce Guo
Jingxuan He
Shengeng Tang
Zhangye Wang
Lechao Cheng
VGen
DiffM
138
0
0
16 Oct 2024
VideoAgent: Self-Improving Video Generation
Achint Soni
Sreyas Venkataraman
Abhranil Chandra
Sebastian Fischmeister
Percy Liang
Bo Dai
Sherry Yang
LM&Ro
VGen
168
11
0
14 Oct 2024
Animating the Past: Reconstruct Trilobite via Video Generation
Xiaoran Wu
Zien Huang
Chonghan Yu
VGen
89
1
0
10 Oct 2024
Pyramidal Flow Matching for Efficient Video Generative Modeling
Yang Jin
Zhicheng Sun
Ningyuan Li
Kun Xu
K. Xu
...
Nan Zhuang
Quzhe Huang
Yang Song
Yadong Mu
Zhouchen Lin
VGen
168
87
0
08 Oct 2024
ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way
Jiazi Bu
Pengyang Ling
Pan Zhang
Tong Wu
Xiaoyi Dong
Yuhang Zang
Yuhang Cao
Dahua Lin
Jiaqi Wang
DiffM
VGen
44
0
0
08 Oct 2024
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
163
35
0
03 Oct 2024
COMUNI: Decomposing Common and Unique Video Signals for Diffusion-based Video Generation
Mingzhen Sun
Weining Wang
Xinxin Zhu
Jing Liu
VGen
DiffM
63
0
0
02 Oct 2024
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Jie Cheng
Ruixi Qiao
Gang Xiong
Binhua Li
Yingwei Ma
Binhua Li
Yongbin Li
Yisheng Lv
OffRL
OnRL
LM&Ro
133
4
0
01 Oct 2024
Replace Anyone in Videos
Xiang Wang
Shiwei Zhang
Haonan Qiu
Ruihang Chu
Zekun Li
Yuanxing Zhang
Changxin Gao
Yuehuan Wang
Chunhua Shen
Nong Sang
VGen
DiffM
123
1
0
30 Sep 2024
Trustworthy Text-to-Image Diffusion Models: A Timely and Focused Survey
Yi Zhang
Zhen Chen
Chih-Hong Cheng
Wenjie Ruan
Xiaowei Huang
Dezong Zhao
David Flynn
Siddartha Khastgir
Xingyu Zhao
MedIm
97
4
0
26 Sep 2024
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling
Yifang Men
Yuan Yao
Miaomiao Cui
Liefeng Bo
DiffM
138
30
0
24 Sep 2024
TextToon: Real-Time Text Toonify Head Avatar from Single Video
Luchuan Song
Lele Chen
Celong Liu
Pinxin Liu
Chenliang Xu
DiffM
96
9
0
23 Sep 2024
JVID: Joint Video-Image Diffusion for Visual-Quality and Temporal-Consistency in Video Generation
Hadrien Reynaud
Matthew Baugh
Mischa Dombrowski
Sarah Cechnicka
Qingjie Meng
Bernhard Kainz
VLM
66
0
0
21 Sep 2024
DNI: Dilutional Noise Initialization for Diffusion Video Editing
Sunjae Yoon
Gwanhyeong Koo
Ji Woo Hong
Chang D. Yoo
DiffM
81
2
0
19 Sep 2024
Automatic Scene Generation: State-of-the-Art Techniques, Models, Datasets, Challenges, and Future Prospects
Awal Ahmed Fime
Saifuddin Mahmud
Arpita Das
Md. Sunzidul Islam
Hong-Hoon Kim
VGen
3DV
46
1
0
14 Sep 2024
AMG: Avatar Motion Guided Video Generation
Zhangsihao Yang
Mengyi Shan
Mohammad Farazi
Wenhui Zhu
Yanxi Chen
Xuanzhao Dong
Yalin Wang
VGen
DiffM
117
0
0
02 Sep 2024
Compositional 3D-aware Video Generation with LLM Director
Hanxin Zhu
Tianyu He
Anni Tang
Junliang Guo
Zhibo Chen
Jiang Bian
DiffM
VGen
108
7
0
31 Aug 2024
One-Shot Learning Meets Depth Diffusion in Multi-Object Videos
Anisha Jain
VGen
DiffM
MDE
42
1
0
29 Aug 2024
Alignment is All You Need: A Training-free Augmentation Strategy for Pose-guided Video Generation
Xiaoyu Jin
Zunnan Xu
Mingwen Ou
Wenming Yang
DiffM
89
7
0
29 Aug 2024
EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation
Cong Wang
Jiaxi Gu
Panwen Hu
Haoyu Zhao
Yuanfan Guo
J. N. Han
Hang Xu
Xiaodan Liang
VGen
DiffM
99
7
0
23 Aug 2024
Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data
Tao Yang
Yangming Shi
Yunwen Huang
Feng Chen
Yin Zheng
Lei Zhang
DiffM
VGen
87
0
0
19 Aug 2024
Previous
1
2
3
4
5
...
8
9
10
Next