ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2105.13290
  4. Cited By
CogView: Mastering Text-to-Image Generation via Transformers

CogView: Mastering Text-to-Image Generation via Transformers

26 May 2021
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
Da Yin
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
    ViT
    VLM
ArXivPDFHTML

Papers citing "CogView: Mastering Text-to-Image Generation via Transformers"

50 / 540 papers shown
Title
ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions
ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions
Aashish Anantha Ramakrishnan
Sharon X. Huang
Dongwon Lee
29
5
0
05 Jan 2023
Attribute-Centric Compositional Text-to-Image Generation
Attribute-Centric Compositional Text-to-Image Generation
Yuren Cong
Martin Renqiang Min
Erran L. Li
Bodo Rosenhahn
M. Yang
68
11
0
04 Jan 2023
Muse: Text-To-Image Generation via Masked Generative Transformers
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
197
523
0
02 Jan 2023
TeViS:Translating Text Synopses to Video Storyboards
TeViS:Translating Text Synopses to Video Storyboards
Xu Gu
Yuchong Sun
Feiyue Ni
Shizhe Chen
Xihua Wang
Ruihua Song
Bohao Li
Xiang Cao
DiffM
25
4
0
31 Dec 2022
Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and
  Text-to-Image Diffusion Models
Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models
Jiale Xu
Xintao Wang
Weihao Cheng
Yan-Pei Cao
Ying Shan
Xiaohu Qie
Shenghua Gao
188
161
0
28 Dec 2022
Do DALL-E and Flamingo Understand Each Other?
Do DALL-E and Flamingo Understand Each Other?
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
21
12
0
23 Dec 2022
Optimizing Prompts for Text-to-Image Generation
Optimizing Prompts for Text-to-Image Generation
Y. Hao
Zewen Chi
Li Dong
Furu Wei
55
140
0
19 Dec 2022
Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Alex Nichol
Heewoo Jun
Prafulla Dhariwal
Pamela Mishkin
Mark Chen
DiffM
47
587
0
16 Dec 2022
EM-Paste: EM-guided Cut-Paste with DALL-E Augmentation for Image-level
  Weakly Supervised Instance Segmentation
EM-Paste: EM-guided Cut-Paste with DALL-E Augmentation for Image-level Weakly Supervised Instance Segmentation
Yunhao Ge
Lyne Tchapmi
Brian Nlong Zhao
Laurent Itti
Vibhav Vineet
DiffM
39
5
0
15 Dec 2022
Image Compression with Product Quantized Masked Image Modeling
Image Compression with Product Quantized Masked Image Modeling
Alaaeldin El-Nouby
Matthew Muckley
Karen Ullrich
Ivan Laptev
Jakob Verbeek
Hervé Jégou
MQ
32
31
0
14 Dec 2022
MAGVIT: Masked Generative Video Transformer
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
38
228
0
10 Dec 2022
Diffusion Guided Domain Adaptation of Image Generators
Diffusion Guided Domain Adaptation of Image Generators
Kunpeng Song
Ligong Han
Bingchen Liu
Dimitris N. Metaxas
Ahmed Elgammal
DiffM
28
34
0
08 Dec 2022
OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist
  Models
OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models
Jinze Bai
Rui Men
Han Yang
Xuancheng Ren
Kai Dang
...
Wenhang Ge
Jianxin Ma
Junyang Lin
Jingren Zhou
Chang Zhou
37
15
0
08 Dec 2022
X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using
  CLIP and StableDiffusion
X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion
Hanqing Zhao
Dianmo Sheng
Jianmin Bao
Dongdong Chen
Dong Chen
...
Ce Liu
Wenbo Zhou
Qi Chu
Weiming Zhang
Neng H. Yu
VLM
DiffM
38
39
0
07 Dec 2022
Rethinking the Objectives of Vector-Quantized Tokenizers for Image
  Synthesis
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis
Yuchao Gu
Xintao Wang
Yixiao Ge
Ying Shan
Xiaohu Qie
Mike Zheng Shou
DiffM
32
21
0
06 Dec 2022
CLIPVG: Text-Guided Image Manipulation Using Differentiable Vector
  Graphics
CLIPVG: Text-Guided Image Manipulation Using Differentiable Vector Graphics
Yiren Song
Xuning Shao
Kang Chen
Weidong Zhang
Minzhe Li
Zhongliang Jing
CLIP
VLM
27
22
0
05 Dec 2022
3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation
3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation
Zutao Jiang
Guangsong Lu
Xiaodan Liang
Jihua Zhu
Wei Zhang
Xiaojun Chang
Hang Xu
DiffM
21
8
0
02 Dec 2022
High-Fidelity Guided Image Synthesis with Latent Diffusion Models
High-Fidelity Guided Image Synthesis with Latent Diffusion Models
Jaskirat Singh
Stephen Gould
Liang Zheng
DiffM
41
40
0
30 Nov 2022
CLIP2GAN: Towards Bridging Text with the Latent Space of GANs
CLIP2GAN: Towards Bridging Text with the Latent Space of GANs
Yixuan Wang
Wen-gang Zhou
Jianmin Bao
Weilun Wang
Li Li
Houqiang Li
GAN
CLIP
33
5
0
28 Nov 2022
SpaText: Spatio-Textual Representation for Controllable Image Generation
SpaText: Spatio-Textual Representation for Controllable Image Generation
Omri Avrahami
Thomas Hayes
Oran Gafni
Sonal Gupta
Yaniv Taigman
Devi Parikh
Dani Lischinski
Ohad Fried
Xiaoyue Yin
DiffM
40
203
0
25 Nov 2022
Shifted Diffusion for Text-to-image Generation
Shifted Diffusion for Text-to-image Generation
Yufan Zhou
Bingchen Liu
Yizhe Zhu
Xiao Yang
Changyou Chen
Jinhui Xu
DiffM
27
40
0
24 Nov 2022
Make-A-Story: Visual Memory Conditioned Consistent Story Generation
Make-A-Story: Visual Memory Conditioned Consistent Story Generation
Tanzila Rahman
Hsin-Ying Lee
Jian Ren
Sergey Tulyakov
Shweta Mahajan
Leonid Sigal
DiffM
19
68
0
23 Nov 2022
Tell Me What Happened: Unifying Text-guided Video Completion via
  Multimodal Masked Video Generation
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
Tsu-Jui Fu
Licheng Yu
Ning Zhang
Cheng-Yang Fu
Jong-Chyi Su
William Yang Wang
Sean Bell
VGen
61
37
0
23 Nov 2022
Retrieval-Augmented Multimodal Language Modeling
Retrieval-Augmented Multimodal Language Modeling
Michihiro Yasunaga
Armen Aghajanyan
Weijia Shi
Rich James
J. Leskovec
Percy Liang
M. Lewis
Luke Zettlemoyer
Wen-tau Yih
RALM
22
95
0
22 Nov 2022
SceneComposer: Any-Level Semantic Image Synthesis
SceneComposer: Any-Level Semantic Image Synthesis
Yu Zeng
Zhe-nan Lin
Jianming Zhang
Qing Liu
John Collomosse
Jason Kuen
Vishal M. Patel
DiffM
25
49
0
21 Nov 2022
Versatile Diffusion: Text, Images and Variations All in One Diffusion
  Model
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model
Xingqian Xu
Zhangyang Wang
Eric Zhang
Kai Wang
Humphrey Shi
DiffM
43
186
0
15 Nov 2022
Extreme Generative Image Compression by Learning Text Embedding from
  Diffusion Models
Extreme Generative Image Compression by Learning Text Embedding from Diffusion Models
Zhihong Pan
Xiaoxia Zhou
Hao Tian
DiffM
33
23
0
14 Nov 2022
Arbitrary Style Guidance for Enhanced Diffusion-Based Text-to-Image
  Generation
Arbitrary Style Guidance for Enhanced Diffusion-Based Text-to-Image Generation
Zhihong Pan
Xiaoxia Zhou
Hao Tian
DiffM
20
11
0
14 Nov 2022
A Novel Sampling Scheme for Text- and Image-Conditional Image Synthesis
  in Quantized Latent Spaces
A Novel Sampling Scheme for Text- and Image-Conditional Image Synthesis in Quantized Latent Spaces
Dominic Rampas
Pablo Pernias
Marc Aubreville
DiffM
19
11
0
14 Nov 2022
Large-Scale Bidirectional Training for Zero-Shot Image Captioning
Large-Scale Bidirectional Training for Zero-Shot Image Captioning
Taehoon Kim
Mark A Marsden
Pyunghwan Ahn
Sangyun Kim
Sihaeng Lee
Alessandra Sala
S. Kim
VLM
35
4
0
13 Nov 2022
UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal
  Guidance
UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance
Wei Li
Xue Xu
Xinyan Xiao
Jiacheng Liu
Hu Yang
...
Zhanpeng Wang
Zhifan Feng
Qiaoqiao She
Yajuan Lyu
Hua Wu
121
29
0
28 Oct 2022
How well can Text-to-Image Generative Models understand Ethical Natural
  Language Interventions?
How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?
Hritik Bansal
Da Yin
Masoud Monajatipoor
Kai-Wei Chang
53
99
0
27 Oct 2022
Lafite2: Few-shot Text-to-Image Generation
Lafite2: Few-shot Text-to-Image Generation
Yufan Zhou
Chunyuan Li
Changyou Chen
Jianfeng Gao
Jinhui Xu
DiffM
32
11
0
25 Oct 2022
DiffEdit: Diffusion-based semantic image editing with mask guidance
DiffEdit: Diffusion-based semantic image editing with mask guidance
Guillaume Couairon
Jakob Verbeek
Holger Schwenk
Matthieu Cord
DiffM
80
483
0
20 Oct 2022
OCR-VQGAN: Taming Text-within-Image Generation
OCR-VQGAN: Taming Text-within-Image Generation
Juan A. Rodriguez
David Vazquez
I. Laradji
M. Pedersoli
Pau Rodríguez López
38
19
0
19 Oct 2022
Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for
  Text-to-Image Generation
Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation
Rui Li
Weihua Li
Yi Yang
Hanyu Wei
Jianhua Jiang
Quan-wei Bai
DiffM
27
11
0
18 Oct 2022
Character-Centric Story Visualization via Visual Planning and Token
  Alignment
Character-Centric Story Visualization via Visual Planning and Token Alignment
Hong Chen
Rujun Han
Te-Lin Wu
Hideki Nakayama
Nanyun Peng
DiffM
VGen
27
31
0
16 Oct 2022
One Model to Edit Them All: Free-Form Text-Driven Image Manipulation
  with Semantic Modulations
One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulations
Yi-Chun Zhu
Hongyu Liu
Yibing Song
Ziyang Yuan
Xintong Han
Chun Yuan
Qifeng Chen
Jue Wang
VLM
DiffM
34
31
0
14 Oct 2022
Plausible May Not Be Faithful: Probing Object Hallucination in
  Vision-Language Pre-training
Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training
Wenliang Dai
Zihan Liu
Ziwei Ji
Dan Su
Pascale Fung
MLLM
VLM
32
63
0
14 Oct 2022
Underspecification in Scene Description-to-Depiction Tasks
Underspecification in Scene Description-to-Depiction Tasks
Ben Hutchinson
Jason Baldridge
Vinodkumar Prabhakaran
DiffM
74
32
0
11 Oct 2022
Markup-to-Image Diffusion Models with Scheduled Sampling
Markup-to-Image Diffusion Models with Scheduled Sampling
Yuntian Deng
Noriyuki Kojima
Alexander M. Rush
DiffM
43
4
0
11 Oct 2022
HORIZON: High-Resolution Semantically Controlled Panorama Synthesis
HORIZON: High-Resolution Semantically Controlled Panorama Synthesis
Kun Yan
Lei Ji
Chenfei Wu
Jian Liang
Ming Zhou
Nan Duan
Shuai Ma
36
0
0
10 Oct 2022
Bridging CLIP and StyleGAN through Latent Alignment for Image Editing
Bridging CLIP and StyleGAN through Latent Alignment for Image Editing
Wanfeng Zheng
Qiang Li
Xiaoyan Guo
Pengfei Wan
Zhong-ming Wang
75
14
0
10 Oct 2022
Can Artificial Intelligence Reconstruct Ancient Mosaics?
Can Artificial Intelligence Reconstruct Ancient Mosaics?
Fernando Moral-Andrés
Elena Merino-Gómez
Pedro Reviriego
Fabrizio Lombardi
27
7
0
07 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
275
1,077
0
05 Oct 2022
Imagen Video: High Definition Video Generation with Diffusion Models
Imagen Video: High Definition Video Generation with Diffusion Models
Jonathan Ho
William Chan
Chitwan Saharia
Jay Whang
Ruiqi Gao
...
Diederik P. Kingma
Ben Poole
Mohammad Norouzi
David J. Fleet
Tim Salimans
VGen
62
1,480
0
05 Oct 2022
Progressive Text-to-Image Generation
Progressive Text-to-Image Generation
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
89
4
0
05 Oct 2022
Vision+X: A Survey on Multimodal Learning in the Light of Data
Vision+X: A Survey on Multimodal Learning in the Light of Data
Ye Zhu
Yuehua Wu
N. Sebe
Yan Yan
35
16
0
05 Oct 2022
Visual Prompt Tuning for Generative Transfer Learning
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn
Yuan Hao
José Lezama
Luisa F. Polanía
Huiwen Chang
Han Zhang
Irfan Essa
Lu Jiang
VPVLM
VLM
61
81
0
03 Oct 2022
Membership Inference Attacks Against Text-to-image Generation Models
Membership Inference Attacks Against Text-to-image Generation Models
Yixin Wu
Ning Yu
Zheng Li
Michael Backes
Yang Zhang
DiffM
27
65
0
03 Oct 2022
Previous
123...101189
Next