2204.14217
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
28 April 2022
Ming Ding
Wendi Zheng
Wenyi Hong
Jie Tang
VLM
Papers citing "CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers"
38 / 238 papers shown
PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents
Weixiong Lin
Ziheng Zhao
Xiaoman Zhang
Chaoyi Wu
Ya-Qin Zhang
Yanfeng Wang
Weidi Xie
LM&MA
VLM
MedIm
28
143
0
13 Mar 2023
Scaling up GANs for Text-to-Image Synthesis
Minguk Kang
Jun-Yan Zhu
Richard Y. Zhang
Jaesik Park
Eli Shechtman
Sylvain Paris
Taesung Park
40
441
0
09 Mar 2023
Video-P2P: Video Editing with Cross-attention Control
Shaoteng Liu
Yuechen Zhang
Wenbo Li
Zhe-nan Lin
Jiaya Jia
DiffM
VGen
147
202
0
08 Mar 2023
Lformer: Text-to-Image Generation with L-shape Block Parallel Decoding
Jiacheng Li
Longhui Wei
Zongyuan Zhan
Xinfu He
Siliang Tang
Qi Tian
Yueting Zhuang
24
4
0
07 Mar 2023
ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation
Yuxiang Wei
Yabo Zhang
Zhilong Ji
Jinfeng Bai
Lei Zhang
W. Zuo
DiffM
28
313
0
27 Feb 2023
MetaAID 2.0: An Extensible Framework for Developing Metaverse Applications via Human-controllable Pre-trained Models
Hongyin Zhu
25
6
0
25 Feb 2023
Auditing Gender Presentation Differences in Text-to-Image Models
Yanzhe Zhang
Lu Jiang
Greg Turk
Diyi Yang
EGVM
24
23
0
07 Feb 2023
Grounding Language Models to Images for Multimodal Inputs and Outputs
Jing Yu Koh
Ruslan Salakhutdinov
Daniel Fried
MLLM
31
117
0
31 Jan 2023
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
Ming Tao
Bingkun Bao
Hao Tang
Changsheng Xu
DiffM
VLM
68
101
0
30 Jan 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
Rongjie Huang
Jiawei Huang
Dongchao Yang
Yi Ren
Luping Liu
Mingze Li
Zhenhui Ye
Jinglin Liu
Xiang Yin
Zhou Zhao
DiffM
145
317
0
30 Jan 2023
Towards Arbitrary Text-driven Image Manipulation via Space Alignment
Yun-Hao Bai
Zi-Qi Zhong
Chao Dong
Weichen Zhang
Guowei Xu
Chun Yuan
40
0
0
25 Jan 2023
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Jay Zhangjie Wu
Yixiao Ge
Xintao Wang
Weixian Lei
Yuchao Gu
Yufei Shi
W. Hsu
Ying Shan
Xiaohu Qie
Mike Zheng Shou
VGen
35
691
0
22 Dec 2022
Benchmarking Spatial Relationships in Text-to-Image Generation
Tejas Gokhale
Hamid Palangi
Besmira Nushi
Vibhav Vineet
Eric Horvitz
Ece Kamar
Chitta Baral
Yezhou Yang
EGVM
45
66
0
20 Dec 2022
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
Su Wang
Chitwan Saharia
Ceslee Montgomery
Jordi Pont-Tuset
Shai Noy
...
Radu Soricut
Jason Baldridge
Mohammad Norouzi
Peter Anderson
William Chan
35
173
0
13 Dec 2022
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Weixi Feng
Xuehai He
Tsu-jui Fu
Varun Jampani
Arjun Reddy Akula
P. Narayana
Sugato Basu
Qing Guo
William Yang Wang
CoGe
45
299
0
09 Dec 2022
CLIPVG: Text-Guided Image Manipulation Using Differentiable Vector Graphics
Yiren Song
Xuning Shao
Kang Chen
Weidong Zhang
Minzhe Li
Zhongliang Jing
CLIP
VLM
27
22
0
05 Dec 2022
Breaking the Spurious Causality of Conditional Generation via Fairness Intervention with Corrective Sampling
J. Nam
Sangwoo Mo
Jaeho Lee
Jinwoo Shin
29
7
0
05 Dec 2022
3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation
Zutao Jiang
Guangsong Lu
Xiaodan Liang
Jihua Zhu
Wei Zhang
Xiaojun Chang
Hang Xu
DiffM
21
8
0
02 Dec 2022
SpaText: Spatio-Textual Representation for Controllable Image Generation
Omri Avrahami
Thomas Hayes
Oran Gafni
Sonal Gupta
Yaniv Taigman
Devi Parikh
Dani Lischinski
Ohad Fried
Xi Yin
DiffM
37
203
0
25 Nov 2022
Shifted Diffusion for Text-to-image Generation
Yufan Zhou
Bingchen Liu
Yizhe Zhu
Xiao Yang
Changyou Chen
Jinhui Xu
DiffM
24
40
0
24 Nov 2022
ReCo: Region-Controlled Text-to-Image Generation
Zhengyuan Yang
Jianfeng Wang
Zhe Gan
Linjie Li
Kevin Qinghong Lin
...
Nan Duan
Zicheng Liu
Ce Liu
Michael Zeng
Lijuan Wang
DiffM
56
140
0
23 Nov 2022
Inversion-Based Style Transfer with Diffusion Models
Yu-xin Zhang
Nisha Huang
Fan Tang
Haibin Huang
Chongyang Ma
Weiming Dong
Changsheng Xu
DiffM
30
254
0
23 Nov 2022
Retrieval-Augmented Multimodal Language Modeling
Michihiro Yasunaga
Armen Aghajanyan
Weijia Shi
Rich James
J. Leskovec
Percy Liang
M. Lewis
Luke Zettlemoyer
Wen-tau Yih
RALM
16
95
0
22 Nov 2022
MagicVideo: Efficient Video Generation With Latent Diffusion Models
Daquan Zhou
Weimin Wang
Hanshu Yan
Weiwei Lv
Yizhe Zhu
Jiashi Feng
DiffM
VGen
39
372
0
20 Nov 2022
A Novel Sampling Scheme for Text- and Image-Conditional Image Synthesis in Quantized Latent Spaces
Dominic Rampas
Pablo Pernias
Marc Aubreville
DiffM
19
11
0
14 Nov 2022
Large-Scale Bidirectional Training for Zero-Shot Image Captioning
Taehoon Kim
Mark A Marsden
Pyunghwan Ahn
Sangyun Kim
Sihaeng Lee
Alessandra Sala
S. Kim
VLM
27
4
0
13 Nov 2022
clip2latent: Text driven sampling of a pre-trained StyleGAN using denoising diffusion and CLIP
Justin N. M. Pinkney
Chuan Li
CLIP
VLM
52
20
0
05 Oct 2022
Progressive Text-to-Image Generation
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
89
4
0
05 Oct 2022
Understanding Pure CLIP Guidance for Voxel Grid NeRF Models
Han-Hung Lee
Angel X. Chang
24
63
0
30 Sep 2022
Make-A-Video: Text-to-Video Generation without Text-Video Data
Uriel Singer
Adam Polyak
Thomas Hayes
Xi Yin
Jie An
...
Oron Ashual
Oran Gafni
Devi Parikh
Sonal Gupta
Yaniv Taigman
DiffM
VGen
39
1,351
0
29 Sep 2022
Text-Free Learning of a Natural Language Interface for Pretrained Face Generators
Xiaodan Du
Raymond A. Yeh
Nicholas I. Kolkin
Eli Shechtman
Gregory Shakhnarovich
CLIP
29
1
0
08 Sep 2022
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz
Yuanzhen Li
Varun Jampani
Yael Pritch
Michael Rubinstein
Kfir Aberman
29
2,710
0
25 Aug 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
107
1,062
0
22 Jun 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
254
566
0
29 May 2022
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
29
48
0
27 Dec 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,443
0
11 Nov 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018