ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.15868
  4. Cited By
CogVideo: Large-scale Pretraining for Text-to-Video Generation via
  Transformers

CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers

29 May 2022
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
    DiffM
ArXiv (abs)PDFHTML

Papers citing "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"

50 / 458 papers shown
Title
Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts
Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts
Yuyang Zhao
Enze Xie
Lanqing Hong
Zhenguo Li
G. Lee
DiffMVGen
102
34
0
15 May 2023
Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style
  Transfer
Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer
Nisha Huang
Yuxin Zhang
Weiming Dong
DiffMVGen
66
17
0
09 May 2023
Visual Transformation Telling
Visual Transformation Telling
Wanqing Cui
Mustafa Nasir-Moin
Yanyan Lan
Viola J. Chen
Jiafeng Guo
Xueqi Cheng
LRM
113
1
0
03 May 2023
Long-Term Rhythmic Video Soundtracker
Long-Term Rhythmic Video Soundtracker
Jiashuo Yu
Yaohui Wang
Xinyuan Chen
Xiao Sun
Yu Qiao
DiffM
105
13
0
02 May 2023
A Portrait of Emotion: Empowering Self-Expression through AI-Generated
  Art
A Portrait of Emotion: Empowering Self-Expression through AI-Generated Art
Y. Lee
Yongha Park
S. Hahn
81
5
0
26 Apr 2023
Align your Latents: High-Resolution Video Synthesis with Latent
  Diffusion Models
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
A. Blattmann
Robin Rombach
Huan Ling
Tim Dockhorn
Seung Wook Kim
Sanja Fidler
Karsten Kreis
3DGSVGen
246
1,106
0
18 Apr 2023
Generative Disco: Text-to-Video Generation for Music Visualization
Generative Disco: Text-to-Video Generation for Music Visualization
Vivian Liu
Tao Long
Nathan Raw
Lydia B. Chilton
VGen
66
34
0
17 Apr 2023
Text2Performer: Text-Driven Human Video Generation
Text2Performer: Text-Driven Human Video Generation
Yuming Jiang
Shuai Yang
Tong Liang Koh
Wayne Wu
Chen Change Loy
Ziwei Liu
DiffMVGen
98
52
0
17 Apr 2023
Latent-Shift: Latent Diffusion with Temporal Shift for Efficient
  Text-to-Video Generation
Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation
Jie An
Songyang Zhang
Harry Yang
Sonal Gupta
Jia-Bin Huang
Jiebo Luo
Xiaoyue Yin
DiffMVGen
114
114
0
17 Apr 2023
Video Generation Beyond a Single Clip
Video Generation Beyond a Single Clip
Hsin-Ping Huang
Yu-Chuan Su
Ming-Hsuan Yang
VLMDiffMVGen
93
3
0
15 Apr 2023
MoStGAN-V: Video Generation with Temporal Motion Styles
MoStGAN-V: Video Generation with Temporal Motion Styles
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VGen
75
32
0
05 Apr 2023
Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free
  Videos
Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos
Yue Ma
Yin-Yin He
Xiaodong Cun
Xintao Wang
Siran Chen
Ying Shan
Xiu Li
Qifeng Chen
DiffMVGen
106
195
0
03 Apr 2023
TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking
  Styles
TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles
Yifeng Ma
Suzhe Wang
Yu-qiong Ding
Lincheng Li
Bowen Ma
Tangjie Lv
Changjie Fan
Zhipeng Hu
Zhidong Deng
Xin Yu
CLIP
87
22
0
01 Apr 2023
Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
Wen Wang
Yan Jiang
K. Xie
Zide Liu
Hao Chen
Yue Cao
Xinlong Wang
Chunhua Shen
DiffMVGen
110
116
0
30 Mar 2023
DDP: Diffusion Model for Dense Visual Prediction
DDP: Diffusion Model for Dense Visual Prediction
Yuanfeng Ji
Zhe Chen
Enze Xie
Lanqing Hong
Xihui Liu
Zhaoqiang Liu
Tong Lu
Zhenguo Li
Ping Luo
DiffMVLM
133
138
0
30 Mar 2023
Sounding Video Generator: A Unified Framework for Text-guided Sounding
  Video Generation
Sounding Video Generator: A Unified Framework for Text-guided Sounding Video Generation
Jiawei Liu
Weining Wang
Sihan Chen
Xinxin Zhu
Qingbin Liu
DiffMVGen
84
14
0
29 Mar 2023
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic
  Textual Guidance
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance
Yiwei Ma
Xiaioqing Zhang
Xiaoshuai Sun
Jiayi Ji
Haowei Wang
Guannan Jiang
Weilin Zhuang
Rongrong Ji
98
40
0
28 Mar 2023
Fine-grained Audible Video Description
Fine-grained Audible Video Description
Xuyang Shen
Dong Li
Jinxing Zhou
Zhen Qin
Bowen He
...
Yuchao Dai
Lingpeng Kong
Meng Wang
Yu Qiao
Yiran Zhong
VGen
92
11
0
27 Mar 2023
Seer: Language Instructed Video Prediction with Latent Diffusion Models
Seer: Language Instructed Video Prediction with Latent Diffusion Models
Xianfan Gu
Chuan Wen
Weirui Ye
Jiaming Song
Yang Gao
DiffMVGen
64
43
0
27 Mar 2023
CelebV-Text: A Large-Scale Facial Text-Video Dataset
CelebV-Text: A Large-Scale Facial Text-Video Dataset
Jianhui Yu
Hao Zhu
Liming Jiang
Chen Change Loy
Weidong (Tom) Cai
Wayne Wu
77
62
0
26 Mar 2023
Conditional Image-to-Video Generation with Latent Flow Diffusion Models
Conditional Image-to-Video Generation with Latent Flow Diffusion Models
Haomiao Ni
Changhao Shi
Kaican Li
Sharon X. Huang
Martin Renqiang Min
VGenDiffM
90
178
0
24 Mar 2023
Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion
  Models
Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models
Willi Menapace
Aliaksandr Siarohin
Stéphane Lathuilière
Panos Achlioptas
Vladislav Golyanik
Sergey Tulyakov
Elisa Ricci
LM&RoVGenDiffM
105
16
0
23 Mar 2023
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video
  Generators
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
Levon Khachatryan
A. Movsisyan
Vahram Tadevosyan
Roberto Henschel
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
VGen
106
581
0
23 Mar 2023
Pix2Video: Video Editing using Image Diffusion
Pix2Video: Video Editing using Image Diffusion
Duygu Ceylan
C. Huang
Niloy J. Mitra
DiffMVGen
149
262
0
22 Mar 2023
NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation
NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation
Sheng-Siang Yin
Chenfei Wu
Huan Yang
Jianfeng Wang
Xiaodong Wang
...
Gong Ming
Lijuan Wang
Zicheng Liu
Houqiang Li
Nan Duan
VGen
83
137
0
22 Mar 2023
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to
  GPT-5 All You Need?
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?
Chaoning Zhang
Chenshuang Zhang
Sheng Zheng
Yu Qiao
Chenghao Li
...
Lik-Hang Lee
Yang Yang
Heng Tao Shen
In So Kweon
Choong Seon Hong
186
170
0
21 Mar 2023
Towards End-to-End Generative Modeling of Long Videos with
  Memory-Efficient Bidirectional Transformers
Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers
Jaehoon Yoo
Semin Kim
Doyup Lee
Chiheon Kim
Seunghoon Hong
84
3
0
20 Mar 2023
VideoFusion: Decomposed Diffusion Models for High-Quality Video
  Generation
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
Zhengxiong Luo
Dayou Chen
Yingya Zhang
Yan Huang
Liangsheng Wang
Yujun Shen
Deli Zhao
Jinren Zhou
Tien-Ping Tan
DiffMVGen
220
322
0
15 Mar 2023
Accountable Textual-Visual Chat Learns to Reject Human Instructions in
  Image Re-creation
Accountable Textual-Visual Chat Learns to Reject Human Instructions in Image Re-creation
Zhiwei Zhang
Yuliang Liu
MLLM
80
0
0
10 Mar 2023
Video-P2P: Video Editing with Cross-attention Control
Video-P2P: Video Editing with Cross-attention Control
Shaoteng Liu
Yuechen Zhang
Wenbo Li
Zhe Lin
Jiaya Jia
DiffMVGen
219
221
0
08 Mar 2023
Neural Vector Fields: Implicit Representation by Explicit Learning
Neural Vector Fields: Implicit Representation by Explicit Learning
Xianghui Yang
Guosheng Lin
Zhenghao Chen
Luping Zhou
AI4CE
101
18
0
08 Mar 2023
StraIT: Non-autoregressive Generation with Stratified Image Transformer
StraIT: Non-autoregressive Generation with Stratified Image Transformer
Shengju Qian
Huiwen Chang
Yuanzhen Li
Zizhao Zhang
Jiaya Jia
Han Zhang
114
12
0
01 Mar 2023
AVscript: Accessible Video Editing with Audio-Visual Scripts
AVscript: Accessible Video Editing with Audio-Visual Scripts
Mina Huh
Saelyne Yang
Yi-Hao Peng
Xiang Ánthony' Chen
Young-Ho Kim
Amy Pavel
74
34
0
27 Feb 2023
MetaAID 2.0: An Extensible Framework for Developing Metaverse
  Applications via Human-controllable Pre-trained Models
MetaAID 2.0: An Extensible Framework for Developing Metaverse Applications via Human-controllable Pre-trained Models
Hongyin Zhu
58
6
0
25 Feb 2023
Consistent Diffusion Models: Mitigating Sampling Drift by Learning to be
  Consistent
Consistent Diffusion Models: Mitigating Sampling Drift by Learning to be Consistent
Giannis Daras
Y. Dagan
A. Dimakis
C. Daskalakis
DiffM
130
49
0
17 Feb 2023
Structure and Content-Guided Video Synthesis with Diffusion Models
Structure and Content-Guided Video Synthesis with Diffusion Models
Patrick Esser
Johnathan Chiu
Parmida Atighehchian
Jonathan Granskog
Anastasis Germanidis
DiffMVGen
191
539
0
06 Feb 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion
  Models
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
Rongjie Huang
Jia-Bin Huang
Dongchao Yang
Yi Ren
Luping Liu
Mingze Li
Zhenhui Ye
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
DiffM
238
344
0
30 Jan 2023
MusicLM: Generating Music From Text
MusicLM: Generating Music From Text
A. Agostinelli
Timo I. Denk
Zalan Borsos
Jesse Engel
Mauro Verzetti
...
Adam Roberts
Marco Tagliasacchi
Matthew Sharifi
Neil Zeghidour
Christian Frank
MGen
152
451
0
26 Jan 2023
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for
  Text-to-Video Generation
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Jay Zhangjie Wu
Yixiao Ge
Xintao Wang
Weixian Lei
Yuchao Gu
Yufei Shi
Wynne Hsu
Ying Shan
Xiaohu Qie
Mike Zheng Shou
VGen
177
752
0
22 Dec 2022
Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Alex Nichol
Heewoo Jun
Prafulla Dhariwal
Pamela Mishkin
Mark Chen
DiffM
141
614
0
16 Dec 2022
Towards Smooth Video Composition
Towards Smooth Video Composition
Qihang Zhang
Ceyuan Yang
Yujun Shen
Yinghao Xu
Bolei Zhou
VGen
87
14
0
14 Dec 2022
MAGVIT: Masked Generative Video Transformer
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffMVGen
121
248
0
10 Dec 2022
Rethinking the Objectives of Vector-Quantized Tokenizers for Image
  Synthesis
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis
Yuchao Gu
Xintao Wang
Yixiao Ge
Ying Shan
Xiaohu Qie
Mike Zheng Shou
DiffM
98
22
0
06 Dec 2022
TPA-Net: Generate A Dataset for Text to Physics-based Animation
TPA-Net: Generate A Dataset for Text to Physics-based Animation
Yuxing Qiu
Feng Gao
Minchen Li
Govind Thattai
Yin Yang
Chenfanfu Jiang
PINNDiffMVGen
58
0
0
25 Nov 2022
Latent Video Diffusion Models for High-Fidelity Long Video Generation
Latent Video Diffusion Models for High-Fidelity Long Video Generation
Yin-Yin He
Tianyu Yang
Yong Zhang
Ying Shan
Qifeng Chen
DiffMVGen
114
243
0
23 Nov 2022
Tell Me What Happened: Unifying Text-guided Video Completion via
  Multimodal Masked Video Generation
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
Tsu-Jui Fu
Licheng Yu
Ning Zhang
Cheng-Yang Fu
Jong-Chyi Su
William Yang Wang
Sean Bell
VGen
151
38
0
23 Nov 2022
MagicVideo: Efficient Video Generation With Latent Diffusion Models
MagicVideo: Efficient Video Generation With Latent Diffusion Models
Daquan Zhou
Weimin Wang
Hanshu Yan
Weiwei Lv
Yizhe Zhu
Jiashi Feng
DiffMVGen
131
390
0
20 Nov 2022
SSGVS: Semantic Scene Graph-to-Video Synthesis
SSGVS: Semantic Scene Graph-to-Video Synthesis
Yuren Cong
Jinhui Yi
Bodo Rosenhahn
M. Yang
137
8
0
11 Nov 2022
Towards Real-Time Text2Video via CLIP-Guided, Pixel-Level Optimization
Towards Real-Time Text2Video via CLIP-Guided, Pixel-Level Optimization
Peter Schaldenbrand
Zhixuan Liu
Jean Oh
CLIP
130
0
0
23 Oct 2022
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
Ivan Kapelyukh
Vitalis Vosylius
Edward Johns
LM&RoDiffM
236
149
0
05 Oct 2022
Previous
123...1089
Next