Vision + Language Applications: A Survey

24 May 2023
Yutong Zhou
N. Shimada
    VLM
ArXiv (abs) | PDF | HTML | Github (2346★)

Papers citing "Vision + Language Applications: A Survey"

50 / 111 papers shown
Conditional Prompt Learning for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VLM, CLIP, VPVLM
141
1,359
0
10 Mar 2022
Autoregressive Image Generation using Residual Quantization
Doyup Lee
Chiheon Kim
Saehoon Kim
Minsu Cho
Wook-Shin Han
VGen
277
378
0
03 Mar 2022
Generative Adversarial Networks
Gilad Cohen
Raja Giryes
GAN
298
30,150
0
01 Mar 2022
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP
Zihao Wang
Wei Liu
Qian He
Xin-ru Wu
Zili Yi
CLIP, VLM
247
75
0
01 Mar 2022
StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Translation
Peter Schaldenbrand
Zhixuan Liu
Jean Oh
CLIP
91
44
0
24 Feb 2022
NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN
Minheng Ni
Chenfei Wu
Haoyang Huang
Daxin Jiang
W. Zuo
Nan Duan
57
19
0
10 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM, ObjD
157
880
0
07 Feb 2022
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
124
50
0
27 Dec 2021
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Alex Nichol
Prafulla Dhariwal
Aditya A. Ramesh
Pranav Shyam
Pamela Mishkin
Bob McGrew
Ilya Sutskever
Mark Chen
364
3,627
0
20 Dec 2021
More Control for Free! Image Synthesis with Semantic Diffusion Guidance
Xihui Liu
Dong Huk Park
S. Azadi
Gong Zhang
Arman Chopikyan
Yuxiao Hu
Humphrey Shi
Anna Rohrbach
Trevor Darrell
DiffM
98
256
0
10 Dec 2021
Multimodal Conditional Image Synthesis with Product-of-Experts GANs
Xun Huang
Arun Mallya
Ting-Chun Wang
Xuan Li
DiffM
89
90
0
09 Dec 2021
FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization
Xingchao Liu
Chengyue Gong
Lemeng Wu
Shujian Zhang
Haoran Su
Qiang Liu
CLIP
82
91
0
02 Dec 2021
CLIPstyler: Image Style Transfer with a Single Text Condition
Gihyun Kwon
Jong Chul Ye
VLM, CLIP
86
247
0
01 Dec 2021
Blended Diffusion for Text-driven Editing of Natural Images
Omri Avrahami
Dani Lischinski
Ohad Fried
DiffM
129
954
0
29 Nov 2021
LAFITE: Towards Language-Free Training for Text-to-Image Generation
Yufan Zhou
Ruiyi Zhang
Changyou Chen
Chunyuan Li
Chris Tensmeyer
Tong Yu
Jiuxiang Gu
Jinhui Xu
Tong Sun
VLM
79
168
0
27 Nov 2021
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Chenfei Wu
Jian Liang
Lei Ji
Fan Yang
Yuejian Fang
Daxin Jiang
Nan Duan
ViT, VGen
75
296
0
24 Nov 2021
Rhythm is a Dancer: Music-Driven Motion Synthesis with Global Structure
A. Aristidou
Anastasios Yiannakidis
Kfir Aberman
Daniel Cohen-Or
Ariel Shamir
Y. Chrysanthou
82
81
0
23 Nov 2021
Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN
Zhenyu Xie
Zaiyu Huang
Fuwei Zhao
Haoye Dong
Michael C. Kampffmeyer
Xiaodan Liang
88
46
0
20 Nov 2021
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
VLM, MLLM, CLIP
243
1,444
0
03 Nov 2021
Towards artificial general intelligence via a multimodal foundation model
Nanyi Fei
Zhiwu Lu
Yizhao Gao
Guoxing Yang
Yuqi Huo
...
Ruihua Song
Xin Gao
Tao Xiang
Haoran Sun
Jiling Wen
AI4CE, LRM
86
226
0
27 Oct 2021
Unifying Multimodal Transformer for Bi-directional Image and Text Generation
Yupan Huang
Hongwei Xue
Bei Liu
Yutong Lu
71
59
0
19 Oct 2021
DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation
Gwanghyun Kim
Taesung Kwon
Jong Chul Ye
DiffM
203
656
0
06 Oct 2021
Design Guidelines for Prompt Engineering Text-to-Image Generative Models
Vivian Liu
Lydia B. Chilton
65
501
0
14 Sep 2021
Talk-to-Edit: Fine-Grained Facial Editing via Dialog
Yuming Jiang
Ziqi Huang
Xingang Pan
Chen Change Loy
Ziwei Liu
DiffM
139
129
0
09 Sep 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM, CLIP, VLM
507
2,413
0
02 Sep 2021
Cycle-Consistent Inverse GAN for Text-to-Image Synthesis
Hao Wang
Guosheng Lin
Steven C. H. Hoi
Chunyan Miao
77
48
0
03 Aug 2021
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Pengfei Liu
Weizhe Yuan
Jinlan Fu
Zhengbao Jiang
Hiroaki Hayashi
Graham Neubig
VLM, SyDa
234
4,004
0
28 Jul 2021
CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders
Kevin Frans
Lisa Soros
Olaf Witkowski
CLIP
89
212
0
28 Jun 2021
Speech2Properties2Gestures: Gesture-Property Prediction as a Tool for Generating Representational Gestures from Speech
Taras Kucherenko
Rajmund Nagy
Patrik Jonell
Michael Neff
Hedvig Kjellström
G. Henter
124
19
0
28 Jun 2021
CogView: Mastering Text-to-Image Generation via Transformers
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
...
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
ViT, VLM
125
782
0
26 May 2021
PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei Zeng
Xiaozhe Ren
Teng Su
Hui Wang
Yi-Lun Liao
...
Gaojun Fan
Yaowei Wang
Xuefeng Jin
Qun Liu
Yonghong Tian
ALM, MoE, AI4CE
74
213
0
26 Apr 2021
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
Or Patashnik
Zongze Wu
Eli Shechtman
Daniel Cohen-Or
Dani Lischinski
CLIP, VLM
129
1,210
0
31 Mar 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
418
5,000
0
24 Feb 2021
Adversarial Text-to-Image Synthesis: A Review
Stanislav Frolov
Tobias Hinz
Federico Raue
Jörn Hees
Andreas Dengel
EGVM
64
177
0
25 Jan 2021
TryOnGAN: Body-Aware Try-On via Layered Interpolation
Kathleen M. Lewis
Srivatsan Varadharajan
Ira Kemelmacher-Shlizerman
3DH
146
49
0
06 Jan 2021
TiVGAN: Text to Image to Video Generation with Step-by-Step Evolutionary Generator
Doyeon Kim
Donggyu Joo
Junmo Kim
GAN
57
48
0
04 Sep 2020
Analyzing and Improving the Image Quality of StyleGAN
Tero Karras
S. Laine
M. Aittala
Janne Hellsten
J. Lehtinen
Timo Aila
GAN
321
5,829
0
03 Dec 2019
Semantic Object Accuracy for Generative Text-to-Image Synthesis
Tobias Hinz
Stefan Heinrich
S. Wermter
EGVM
87
159
0
29 Oct 2019
A Survey and Taxonomy of Adversarial Neural Networks for Text-to-Image Synthesis
Jorge Agnese
Jonathan Herrera
Haicheng Tao
Xingquan Zhu
EGVM
76
103
0
21 Oct 2019
Controllable Text-to-Image Generation
Bowen Li
Xiaojuan Qi
Thomas Lukasiewicz
Philip Torr
GAN
86
357
0
16 Sep 2019
FTGAN: A Fully-trained Generative Adversarial Networks for Text to Face Generation
Xiang Chen
Lingbo Qing
Xiaohai He
Xiaodong Luo
Yining Xu
GAN, CVBM
56
34
0
11 Apr 2019
DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis
Minfeng Zhu
Pingbo Pan
Wei Chen
Yi Yang
GAN
57
583
0
02 Apr 2019
MirrorGAN: Learning Text-to-image Generation by Redescription
Tingting Qiao
Jing Zhang
Duanqing Xu
Dacheng Tao
VLM, GAN
61
542
0
14 Mar 2019
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
619
10,590
0
12 Dec 2018
Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction
Alaaeldin El-Nouby
Shikhar Sharma
Hannes Schulz
Devon Hjelm
Layla El Asri
Samira Ebrahimi Kahou
Yoshua Bengio
Graham W. Taylor
VLM
100
123
0
24 Nov 2018
Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language
Seonghyeon Nam
Yunji Kim
Seon Joo Kim
GAN
79
207
0
29 Oct 2018
Text-to-Image-to-Text Translation using Cycle Consistent Adversarial Networks
S. Gorti
J. Ma
GAN
50
28
0
14 Aug 2018
To learn image super-resolution, use a GAN to learn how to do image degradation first
Adrian Bulat
J. Yang
Georgios Tzimiropoulos
SupR
65
354
0
30 Jul 2018
MC-GAN: Multi-conditional Generative Adversarial Network for Image Synthesis
Hyojin Park
Y. Yoo
Nojun Kwak
GAN
74
59
0
03 May 2018
Text to Image Synthesis Using Generative Adversarial Networks
Cristian Bodnar
GAN
56
34
0
02 May 2018