ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.15868
  4. Cited By
CogVideo: Large-scale Pretraining for Text-to-Video Generation via
  Transformers

CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers

29 May 2022
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
    DiffM
ArXiv (abs)PDFHTML

Papers citing "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"

50 / 458 papers shown
Title
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model
  Adaptation
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
Guy Yariv
Itai Gat
Sagie Benaim
Lior Wolf
Idan Schwartz
Yossi Adi
DiffMVGen
108
41
0
28 Sep 2023
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
David Junhao Zhang
Jay Zhangjie Wu
Jia-Wei Liu
Rui Zhao
L. Ran
Yuchao Gu
Difei Gao
Mike Zheng Shou
DiffMVGen
129
223
0
27 Sep 2023
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion
  Models
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models
Yaohui Wang
Xinyuan Chen
Xin Ma
Shangchen Zhou
Ziqi Huang
...
Chen Change Loy
Bo Dai
Dahua Lin
Yu Qiao
Ziwei Liu
VGenDiffM
119
232
0
26 Sep 2023
VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided
  Planning
VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning
Han Lin
Abhaysinh Zala
Jaemin Cho
Joey Tianyi Zhou
LM&RoVGenDiffM
148
81
0
26 Sep 2023
Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM
  Animator
Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator
Hanzhuo Huang
Yufan Feng
Cheng Shi
Lan Xu
Jingyi Yu
Sibei Yang
DiffMVGen
100
66
0
25 Sep 2023
GLOBER: Coherent Non-autoregressive Video Generation via GLOBal Guided
  Video DecodER
GLOBER: Coherent Non-autoregressive Video Generation via GLOBal Guided Video DecodER
Mingzhen Sun
Weining Wang
Zihan Qin
Jiahui Sun
Si-Qing Chen
Qingbin Liu
DiffM
61
3
0
23 Sep 2023
FreeU: Free Lunch in Diffusion U-Net
FreeU: Free Lunch in Diffusion U-Net
Chenyang Si
Ziqi Huang
Yuming Jiang
Ziwei Liu
DiffM
116
147
0
20 Sep 2023
NExT-GPT: Any-to-Any Multimodal LLM
NExT-GPT: Any-to-Any Multimodal LLM
Shengqiong Wu
Hao Fei
Leigang Qu
Wei Ji
Tat-Seng Chua
MLLM
119
507
0
11 Sep 2023
Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation
Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation
Jiaxi Gu
Shicong Wang
Haoyu Zhao
Tianyi Lu
Xing Zhang
Zuxuan Wu
Songcen Xu
Wei Zhang
Yu-Gang Jiang
Hang Xu
DiffMVGen
82
48
0
07 Sep 2023
Enhancing Semantic Communication with Deep Generative Models -- An
  ICASSP Special Session Overview
Enhancing Semantic Communication with Deep Generative Models -- An ICASSP Special Session Overview
Eleonora Grassucci
Yuki Mitsufuji
Ping Zhang
Danilo Comminiello
56
3
0
05 Sep 2023
Neural Vector Fields: Generalizing Distance Vector Fields by Codebooks
  and Zero-Curl Regularization
Neural Vector Fields: Generalizing Distance Vector Fields by Codebooks and Zero-Curl Regularization
Xianghui Yang
Guosheng Lin
Zhenghao Chen
Luping Zhou
103
2
0
04 Sep 2023
VideoGen: A Reference-Guided Latent Diffusion Approach for High
  Definition Text-to-Video Generation
VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation
Xin Li
Wenqing Chu
Ye Wu
Weihang Yuan
Fanglong Liu
Qi Zhang
Fu Li
Haocheng Feng
Errui Ding
Jingdong Wang
VGen
137
53
0
01 Sep 2023
Text2Scene: Text-driven Indoor Scene Stylization with Part-aware Details
Text2Scene: Text-driven Indoor Scene Stylization with Part-aware Details
I. Hwang
Hyeonwoo Kim
Y. Kim
DiffM
58
17
0
31 Aug 2023
Explaining Vision and Language through Graphs of Events in Space and
  Time
Explaining Vision and Language through Graphs of Events in Space and Time
Mihai Masala
Nicolae Cudlenco
Traian Rebedea
Marius Leordeanu
VLM
96
2
0
29 Aug 2023
Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs
Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs
Hao Fei
Shengqiong Wu
Wei Ji
Hanwang Zhang
Tat-Seng Chua
VGenDiffM
89
34
0
26 Aug 2023
APLA: Additional Perturbation for Latent Noise with Adversarial Training
  Enables Consistency
APLA: Additional Perturbation for Latent Noise with Adversarial Training Enables Consistency
Yupu Yao
Shangqi Deng
Zihan Cao
Harry Zhang
Liang-Jian Deng
DiffM
93
14
0
24 Aug 2023
StoryBench: A Multifaceted Benchmark for Continuous Story Visualization
StoryBench: A Multifaceted Benchmark for Continuous Story Visualization
Emanuele Bugliarello
Hernan Moraldo
Ruben Villegas
Mohammad Babaeizadeh
M. Saffar
Han Zhang
D. Erhan
V. Ferrari
Pieter-Jan Kindermans
P. Voigtlaender
VGen
92
11
0
22 Aug 2023
SimDA: Simple Diffusion Adapter for Efficient Video Generation
SimDA: Simple Diffusion Adapter for Efficient Video Generation
Zhen Xing
Qi Dai
Hang-Rui Hu
Zuxuan Wu
Yu-Gang Jiang
VGenDiffM
103
84
0
18 Aug 2023
Dual-Stream Diffusion Net for Text-to-Video Generation
Dual-Stream Diffusion Net for Text-to-Video Generation
Binhui Liu
Xin Liu
Anbo Dai
Zhiyong Zeng
Dan Wang
Zhen Cui
Jian Yang
DiffMVGen
95
10
0
16 Aug 2023
DragNUWA: Fine-grained Control in Video Generation by Integrating Text,
  Image, and Trajectory
DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory
Sheng-Siang Yin
Chenfei Wu
Jian Liang
Jie Shi
Houqiang Li
Gong Ming
Nan Duan
VGen
149
145
0
16 Aug 2023
CoDeF: Content Deformation Fields for Temporally Consistent Video
  Processing
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
Ouyang Hao
Qiuyu Wang
Yuxi Xiao
Qingyan Bai
Juntao Zhang
Kecheng Zheng
Xiaowei Zhou
Qifeng Chen
Yujun Shen
DiffMVGen
123
85
0
15 Aug 2023
Story Visualization by Online Text Augmentation with Context Memory
Story Visualization by Online Text Augmentation with Context Memory
Daechul Ahn
Daneul Kim
Gwangmo Song
Seung Wook Kim
Honglak Lee
Dongyeop Kang
Jonghyun Choi
DiffM
63
5
0
15 Aug 2023
ModelScope Text-to-Video Technical Report
ModelScope Text-to-Video Technical Report
Jiuniu Wang
Hangjie Yuan
Dayou Chen
Yingya Zhang
Xiang Wang
Shiwei Zhang
VGenDiffM
126
431
0
12 Aug 2023
Points-to-3D: Bridging the Gap between Sparse Points and
  Shape-Controllable Text-to-3D Generation
Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation
Chaohui Yu
Qiang-feng Zhou
Jingliang Li
Zhe Zhang
Zhibin Wang
Fan Wang
DiffM
94
41
0
26 Jul 2023
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
Yin-Yin He
Menghan Xia
Haoxin Chen
Xiaodong Cun
Yuan Gong
...
Yong Zhang
Xintao Wang
Chao-Liang Weng
Ying Shan
Qifeng Chen
DiffMVGen
63
79
0
13 Jul 2023
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models
  without Specific Tuning
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
Yuwei Guo
Ceyuan Yang
Anyi Rao
Zhengyang Liang
Yaohui Wang
Yu Qiao
Maneesh Agrawala
Dahua Lin
Bo Dai
VGen
154
883
0
10 Jul 2023
Text-Guided Synthesis of Eulerian Cinemagraphs
Text-Guided Synthesis of Eulerian Cinemagraphs
Aniruddha Mahapatra
Aliaksandr Siarohin
Hsin-Ying Lee
Sergey Tulyakov
Sitong Su
DiffMVGen
94
21
0
06 Jul 2023
MovieFactory: Automatic Movie Creation from Text using Large Generative
  Models for Language and Images
MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images
Sitong Su
Huan Yang
Huiguo He
Wenjing Wang
Zixi Tuo
Wen-Huang Cheng
Lianli Gao
Jingkuan Song
Jianlong Fu
VGenDiffM
90
40
0
12 Jun 2023
The Age of Synthetic Realities: Challenges and Opportunities
The Age of Synthetic Realities: Challenges and Opportunities
J. P. Cardenuto
Jing Yang
Rafael Padilha
Renjie Wan
Daniel Moreira
Haoliang Li
Shiqi Wang
Fernanda A. Andaló
Sébastien Marcel
Anderson de Rezende Rocha
DeLMO
115
30
0
09 Jun 2023
Multi-modal Latent Diffusion
Multi-modal Latent Diffusion
Mustapha Bounoua
Giulio Franzese
Pietro Michiardi
DiffM
98
13
0
07 Jun 2023
Generative Semantic Communication: Diffusion Models Beyond Bit Recovery
Generative Semantic Communication: Diffusion Models Beyond Bit Recovery
Eleonora Grassucci
Sergio Barbarossa
Danilo Comminiello
DiffM
83
58
0
07 Jun 2023
Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions
Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions
Shaoxu Li
DiffM
60
5
0
05 Jun 2023
VideoComposer: Compositional Video Synthesis with Motion Controllability
VideoComposer: Compositional Video Synthesis with Motion Controllability
Xiang Wang
Hangjie Yuan
Shiwei Zhang
Dayou Chen
Jiuniu Wang
Yingya Zhang
Yujun Shen
Deli Zhao
Jingren Zhou
VGenDiffM
121
341
0
03 Jun 2023
Probabilistic Adaptation of Text-to-Video Models
Probabilistic Adaptation of Text-to-Video Models
Mengjiao Yang
Yilun Du
Bo Dai
Dale Schuurmans
J. Tenenbaum
Pieter Abbeel
VGenDiffM
137
26
0
02 Jun 2023
StyleDrop: Text-to-Image Generation in Any Style
StyleDrop: Text-to-Image Generation in Any Style
Kihyuk Sohn
Nataniel Ruiz
Kimin Lee
Daniel Castro Chin
Irina Blok
...
Yuanzhen Li
Yuan Hao
Irfan Essa
Michael Rubinstein
Dilip Krishnan
70
152
0
01 Jun 2023
Make-Your-Video: Customized Video Generation Using Textual and
  Structural Guidance
Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
Jinbo Xing
Menghan Xia
Yuxin Liu
Yuechen Zhang
Yong Zhang
...
Haoxin Chen
Xiaodong Cun
Xintao Wang
Ying Shan
T. Wong
VGenDiffM
85
93
0
01 Jun 2023
SAVE: Spectral-Shift-Aware Adaptation of Image Diffusion Models for
  Text-driven Video Editing
SAVE: Spectral-Shift-Aware Adaptation of Image Diffusion Models for Text-driven Video Editing
Nazmul Karim
Umar Khalid
M. Joneidi
Chen Chen
Nazanin Rahnavard
DiffMVGen
70
5
0
30 May 2023
Gen-L-Video: Multi-Text to Long Video Generation via Temporal
  Co-Denoising
Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
Fu Lee Wang
Wenshuo Chen
Guanglu Song
Han-Jia Ye
Yu Liu
Hongsheng Li
VGenDiffM
117
93
0
29 May 2023
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
Jia-Bin Huang
Yi Ren
Rongjie Huang
Dongchao Yang
Zhenhui Ye
Chen Zhang
Jinglin Liu
Xiang Yin
Zejun Ma
Zhou Zhao
DiffM
120
64
0
29 May 2023
Towards Consistent Video Editing with Text-to-Image Diffusion Models
Towards Consistent Video Editing with Text-to-Image Diffusion Models
Zicheng Zhang
Bonan li
Xuecheng Nie
Congying Han
Tiande Guo
Luoqi Liu
DiffM
64
29
0
27 May 2023
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
Ibrahim Ethem Hamamci
Sezgin Er
Anjany Sekuboyina
Enis Simsar
A. Tezcan
...
Hadrien Reynaud
Sarthak Pati
Christian Bluethgen
M. K. Özdemir
Bjoern Menze
DiffMMedIm
123
24
0
25 May 2023
T1: Scaling Diffusion Probabilistic Fields to High-Resolution on Unified
  Visual Modalities
T1: Scaling Diffusion Probabilistic Fields to High-Resolution on Unified Visual Modalities
Kangfu Mei
Mo Zhou
Vishal M. Patel
DiffM
87
1
0
24 May 2023
Vision + Language Applications: A Survey
Vision + Language Applications: A Survey
Yutong Zhou
N. Shimada
VLM
122
7
0
24 May 2023
DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot
  Text-to-Video Generation
DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot Text-to-Video Generation
Susung Hong
Junyoung Seo
Heeseong Shin
Sung‐Jin Hong
Seung Wook Kim
DiffMVGen
106
36
0
23 May 2023
Control-A-Video: Controllable Text-to-Video Generation with Diffusion
  Models
Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models
Weifeng Chen
Yatai Ji
Jie Wu
Hefeng Wu
Pan Xie
Jiashi Li
Xin Xia
Xuefeng Xiao
Liang Lin
VGen
208
11
0
23 May 2023
ControlVideo: Training-free Controllable Text-to-Video Generation
ControlVideo: Training-free Controllable Text-to-Video Generation
Yabo Zhang
Yuxiang Wei
Dongsheng Jiang
Xiaopeng Zhang
W. Zuo
Qi Tian
VGenDiffM
124
254
0
22 May 2023
InstructVid2Vid: Controllable Video Editing with Natural Language
  Instructions
InstructVid2Vid: Controllable Video Editing with Natural Language Instructions
Bosheng Qin
Juncheng Li
Siliang Tang
Tat-Seng Chua
Yueting Zhuang
VGenDiffM
80
17
0
21 May 2023
Any-to-Any Generation via Composable Diffusion
Any-to-Any Generation via Composable Diffusion
Zineng Tang
Ziyi Yang
Chenguang Zhu
Michael Zeng
Joey Tianyi Zhou
VGenDiffM
115
191
0
19 May 2023
Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
Wenjing Wang
Huan Yang
Zixi Tuo
Huiguo He
Sitong Su
Jianlong Fu
Jiaying Liu
DiffMVGen
158
117
0
18 May 2023
Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
Songwei Ge
Seungjun Nah
Guilin Liu
Tyler Poon
Andrew Tao
Bryan Catanzaro
David Jacobs
Jia-Bin Huang
Ming-Yuan Liu
Yogesh Balaji
DiffMVGen
125
263
0
17 May 2023
Previous
123...10789
Next