2105.13290
Cited By
CogView: Mastering Text-to-Image Generation via Transformers
26 May 2021
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
Da Yin
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
ViT
VLM
Papers citing
"CogView: Mastering Text-to-Image Generation via Transformers"
40 / 540 papers shown
Title
DeepNet: Scaling Transformers to 1,000 Layers
Hongyu Wang
Shuming Ma
Li Dong
Shaohan Huang
Dongdong Zhang
Furu Wei
MoE
AI4CE
28
157
0
01 Mar 2022
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP
Zihao Wang
Wei Liu
Qian He
Xin-ru Wu
Zili Yi
CLIP
VLM
204
73
0
01 Mar 2022
Compute Trends Across Three Eras of Machine Learning
J. Sevilla
Lennart Heim
A. Ho
T. Besiroglu
Marius Hobbhahn
Pablo Villalobos
39
272
0
11 Feb 2022
NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN
Minheng Ni
Chenfei Wu
Haoyang Huang
Daxin Jiang
W. Zuo
Nan Duan
33
19
0
10 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
74
850
0
07 Feb 2022
Music2Video: Automatic Generation of Music Video with fusion of audio and text
Yoonjeon Kim
Joel Jang
Sumin Shin
DiffM
VGen
38
7
0
11 Jan 2022
ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation
Han Zhang
Weichong Yin
Yewei Fang
Lanxin Li
Boqiang Duan
Zhihua Wu
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
27
58
0
31 Dec 2021
Synchronized Audio-Visual Frames with Fractional Positional Encoding for Transformers in Video-to-Text Translation
Philipp Harzig
Moritz Einfalt
Rainer Lienhart
ViT
39
2
0
28 Dec 2021
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
36
48
0
27 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
152
14,738
0
20 Dec 2021
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
26
28
0
16 Dec 2021
Emojich -- zero-shot emoji generation using Russian language: a technical report
Alex Shonenkov
Daria Bakshandaeva
Denis Dimitrov
Aleks D. Nikolich
VLM
32
5
0
04 Dec 2021
FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization
Xingchao Liu
Chengyue Gong
Lemeng Wu
Shujian Zhang
Haoran Su
Qiang Liu
CLIP
35
88
0
02 Dec 2021
TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation
Tan M. Dinh
Rang Nguyen
Binh-Son Hua
EGVM
33
15
0
02 Dec 2021
Exploration into Translation-Equivariant Image Quantization
W. Shin
Gyubok Lee
Jiyoung Lee
Eun-Young Lyou
Joonseok Lee
Edward Choi
41
7
0
01 Dec 2021
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Shuyang Gu
Dong Chen
Jianmin Bao
Fang Wen
Bo Zhang
Dongdong Chen
Lu Yuan
B. Guo
DiffM
74
765
0
29 Nov 2021
Blended Diffusion for Text-driven Editing of Natural Images
Omri Avrahami
Dani Lischinski
Ohad Fried
DiffM
55
921
0
29 Nov 2021
LAFITE: Towards Language-Free Training for Text-to-Image Generation
Yufan Zhou
Ruiyi Zhang
Changyou Chen
Chunyuan Li
Chris Tensmeyer
Tong Yu
Jiuxiang Gu
Jinhui Xu
Tong Sun
VLM
35
163
0
27 Nov 2021
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Chenfei Wu
Jian Liang
Lei Ji
Fan Yang
Yuejian Fang
Daxin Jiang
Nan Duan
ViT
VGen
18
293
0
24 Nov 2021
L-Verse: Bidirectional Generation Between Image and Text
Taehoon Kim
Gwangmo Song
Sihaeng Lee
Sangyun Kim
Yewon Seo
Soonyoung Lee
S. Kim
Honglak Lee
Kyunghoon Bae
28
25
0
22 Nov 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
ViT
88
1,758
0
18 Nov 2021
FILIP: Fine-grained Interactive Language-Image Pre-Training
Lewei Yao
Runhu Huang
Lu Hou
Guansong Lu
Minzhe Niu
Hang Xu
Xiaodan Liang
Zhenguo Li
Xin Jiang
Chunjing Xu
VLM
CLIP
30
619
0
09 Nov 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
130
1,721
0
26 Oct 2021
Unifying Multimodal Transformer for Bi-directional Image and Text Generation
Yupan Huang
Hongwei Xue
Bei Liu
Yutong Lu
19
57
0
19 Oct 2021
NormFormer: Improved Transformer Pretraining with Extra Normalization
Sam Shleifer
Jason Weston
Myle Ott
AI4CE
33
74
0
18 Oct 2021
Taming Visually Guided Sound Generation
Vladimir E. Iashin
Esa Rahtu
VLM
32
122
0
17 Oct 2021
Multimodal Dialogue Response Generation
Qingfeng Sun
Yujing Wang
Can Xu
Kai Zheng
Yaming Yang
Huang Hu
Fei Xu
Jessica Zhang
Xiubo Geng
Daxin Jiang
26
43
0
16 Oct 2021
M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining
Junyang Lin
An Yang
Jinze Bai
Chang Zhou
Le Jiang
...
Jie Zhang
Yong Li
Wei Lin
Jingren Zhou
Hongxia Yang
MoE
92
43
0
08 Oct 2021
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss
Xingyi Cheng
Hezheng Lin
Xiangyu Wu
Fan Yang
Dong Shen
14
149
0
09 Sep 2021
What Users Want? WARHOL: A Generative Model for Recommendation
Jules Samaran
Ugo Tanielian
Romain Beaumont
Flavian Vasile
HAI
20
0
0
02 Sep 2021
RaftMLP: How Much Can Be Done Without Attention and with Less Spatial Locality?
Yuki Tatsunami
Masato Taki
42
12
0
09 Aug 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
58
816
0
14 Jun 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
53
1,088
0
08 Jun 2021
Visformer: The Vision-friendly Transformer
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
120
209
0
26 Apr 2021
Creativity and Machine Learning: A Survey
Giorgio Franceschelli
Mirco Musolesi
VLM
AI4CE
34
40
0
06 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,805
0
24 Feb 2021
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
23
2,132
0
23 Dec 2020
DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis
Ming Tao
Hao Tang
Fei Wu
Xiaoyuan Jing
Bingkun Bao
Changsheng Xu
38
209
0
13 Aug 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
288
2,023
0
28 Jul 2020
Pixel Recurrent Neural Networks
Aaron van den Oord
Nal Kalchbrenner
Koray Kavukcuoglu
SSeg
GAN
272
2,552
0
25 Jan 2016