Enhancing CLIP with GPT-4: Harnessing Visual Descriptions as Prompts

Enhancing CLIP with GPT-4: Harnessing Visual Descriptions as Prompts

21 July 2023

Mayug Maniparambil

Kevin McGuinness

Noel E. O'Connor

Papers citing "Enhancing CLIP with GPT-4: Harnessing Visual Descriptions as Prompts"

11 / 11 papers shown

Title
ProAPO: Progressively Automatic Prompt Optimization for Visual Classification Xiangyan Qu Gaopeng Gou Jiamin Zhuang Jing Yu Kun Song Qihao Wang Yili Li Gang Xiong VLM 91 0 0 13 Mar 2025
Tree of Attributes Prompt Learning for Vision-Language Models Tong Ding Wanhua Li Zhongqi Miao Hanspeter Pfister VLM 54 1 0 15 Oct 2024
FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization Zhaopeng Gu Bingke Zhu Guibo Zhu Yingying Chen Hao Li Ming Tang Jinqiao Wang 42 15 0 21 Apr 2024
Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning Amir Ziai Aneesh Vartakavi VLM VGen 32 0 0 09 Feb 2024
Zero-Shot Robustification of Zero-Shot Models Dyah Adila Changho Shin Lin Cai Frederic Sala 40 18 0 08 Sep 2023
What does a platypus look like? Generating customized prompts for zero-shot image classification Sarah M Pratt Ian Covert Rosanne Liu Ali Farhadi VLM 131 212 0 07 Sep 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Jason W. Wei Xuezhi Wang Dale Schuurmans Maarten Bosma Brian Ichter F. Xia Ed H. Chi Quoc Le Denny Zhou LM&Ro LRM AI4CE ReLM 389 8,495 0 28 Jan 2022
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP Andreas Fürst Elisabeth Rumetshofer Johannes Lehner Viet-Hung Tran Fei Tang ... David P. Kreil Michael K Kopp G. Klambauer Angela Bitto-Nemling Sepp Hochreiter VLM CLIP 207 102 0 21 Oct 2021
Learning to Prompt for Vision-Language Models Kaiyang Zhou Jingkang Yang Chen Change Loy Ziwei Liu VPVLM CLIP VLM 348 2,271 0 02 Sep 2021
The Power of Scale for Parameter-Efficient Prompt Tuning Brian Lester Rami Al-Rfou Noah Constant VPVLM 280 3,848 0 18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision Chao Jia Yinfei Yang Ye Xia Yi-Ting Chen Zarana Parekh Hieu H. Pham Quoc V. Le Yun-hsuan Sung Zhen Li Tom Duerig VLM CLIP 313 3,708 0 11 Feb 2021