2209.07046
Cited By
Exploring Visual Interpretability for Contrastive Language-Image Pre-training
15 September 2022
Yi Li
Hualiang Wang
Yiqun Duan
Han Xu
Xiaomeng Li
CLIP
VLM
Papers citing
"Exploring Visual Interpretability for Contrastive Language-Image Pre-training"
26 / 26 papers shown
Title
Embedding Shift Dissection on CLIP: Effects of Augmentations on VLM's Representation Learning
Ashim Dahal
Saydul Akbar Murad
Nick Rahimi
VLM
96
0
0
30 Mar 2025
Agent-Centric Personalized Multiple Clustering with Multi-Modal LLMs
Ziye Chen
Yiqun Duan
Riheng Zhu
Zhenbang Sun
Biwei Huang
69
0
0
28 Mar 2025
Interpreting CLIP with Hierarchical Sparse Autoencoders
Vladimir Zaigrajew
Hubert Baniecki
P. Biecek
236
1
0
27 Feb 2025
Expanding Language-Image Pretrained Models for General Video Recognition
Bolin Ni
Houwen Peng
Minghao Chen
Songyang Zhang
Gaofeng Meng
Jianlong Fu
Shiming Xiang
Haibin Ling
VLM
CLIP
ViT
96
325
0
04 Aug 2022
RegionCLIP: Region-based Language-Image Pretraining
Yiwu Zhong
Jianwei Yang
Pengchuan Zhang
Chunyuan Li
Noel Codella
...
Luowei Zhou
Xiyang Dai
Lu Yuan
Yin Li
Jianfeng Gao
VLM
CLIP
130
575
0
16 Dec 2021
CRIS: CLIP-Driven Referring Image Segmentation
Zhaoqing Wang
Yu Lu
Qiang Li
Xunqiang Tao
Yan Guo
Ming Gong
Tongliang Liu
VLM
100
369
0
30 Nov 2021
A Simple Long-Tailed Recognition Baseline via Vision-Language Model
Teli Ma
Shijie Geng
Mengmeng Wang
Jing Shao
Jiasen Lu
Hongsheng Li
Peng Gao
Yu Qiao
VLM
83
47
0
29 Nov 2021
LiT: Zero-Shot Transfer with Locked-image text Tuning
Xiaohua Zhai
Tianlin Li
Basil Mustafa
Andreas Steiner
Daniel Keysers
Alexander Kolesnikov
Lucas Beyer
VLM
92
556
0
15 Nov 2021
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLM
CLIP
274
1,040
0
09 Oct 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
251
2,819
0
15 Jun 2021
An Empirical Study of Training Self-Supervised Vision Transformers
Xinlei Chen
Saining Xie
Kaiming He
ViT
150
1,862
0
05 Apr 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
426
1,127
0
17 Feb 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
116
661
0
11 Feb 2021
Big Self-Supervised Models Advance Medical Image Classification
Shekoofeh Azizi
Basil Mustafa
Fiona Ryan
Zach Beaver
Jan Freyberg
...
Alan Karthikesalingam
Simon Kornblith
Ting-Li Chen
Vivek Natarajan
Mohammad Norouzi
SSL
105
514
0
13 Jan 2021
A Survey on Contrastive Self-supervised Learning
Ashish Jaiswal
Ashwin Ramesh Babu
Mohammad Zaki Zadeh
Debapriya Banerjee
F. Makedon
SSL
122
1,391
0
31 Oct 2020
Pooling Methods in Deep Neural Networks, a Review
Hossein Gholamalinezhad
H. Khosravi
195
222
0
16 Sep 2020
Quantifying Attention Flow in Transformers
Samira Abnar
Willem H. Zuidema
144
795
0
02 May 2020
Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks
Mehdi Neshat
Zifan Wang
Bradley Alexander
Fan Yang
Zijian Zhang
Sirui Ding
Markus Wagner
Xia Hu
FAtt
91
1,066
0
03 Oct 2019
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
Zihao Wang
Xihui Liu
Hongsheng Li
Lu Sheng
Junjie Yan
Xiaogang Wang
Jing Shao
VLM
65
304
0
12 Sep 2019
A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI
Erico Tjoa
Cuntai Guan
XAI
92
1,446
0
17 Jul 2019
Exploring the Limits of Weakly Supervised Pretraining
D. Mahajan
Ross B. Girshick
Vignesh Ramanathan
Kaiming He
Manohar Paluri
Yixuan Li
Ashwin R. Bharambe
Laurens van der Maaten
VLM
180
1,367
0
02 May 2018
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Ramprasaath R. Selvaraju
Michael Cogswell
Abhishek Das
Ramakrishna Vedantam
Devi Parikh
Dhruv Batra
FAtt
282
19,981
0
07 Oct 2016
Learning Deep Features for Discriminative Localization
Bolei Zhou
A. Khosla
Àgata Lapedriza
A. Oliva
Antonio Torralba
SSL
SSeg
FAtt
250
9,308
0
14 Dec 2015
Compact Bilinear Pooling
Yang Gao
Oscar Beijbom
Ning Zhang
Trevor Darrell
65
791
0
19 Nov 2015
Unsupervised Visual Representation Learning by Context Prediction
Carl Doersch
Abhinav Gupta
Alexei A. Efros
DRL
SSL
164
2,781
0
19 May 2015
Visualizing and Understanding Convolutional Networks
Matthew D. Zeiler
Rob Fergus
FAtt
SSL
589
15,876
0
12 Nov 2013