Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.12442
Cited By
ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference
17 July 2024
Mengcheng Lan
Chaofeng Chen
Yiping Ke
Xinjiang Wang
Xue Jiang
Wayne Zhang
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference"
17 / 17 papers shown
Title
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Yulin Chen
Zhuotao Tian
VLM
38
0
0
07 May 2025
SPNeRF: Open Vocabulary 3D Neural Scene Segmentation with Superpoints
Weiwen Hu
Niccolò Parodi
Marcus Zepp
I. Feldmann
O. Schreer
Peter Eisert
VLM
141
0
0
19 Mar 2025
OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
Huang Huang
Fangchen Liu
Letian Fu
Tingfan Wu
Mustafa Mukadam
Jitendra Malik
Ken Goldberg
Pieter Abbeel
LM&Ro
VLM
85
6
0
05 Mar 2025
DynamicEarth: How Far are We from Open-Vocabulary Change Detection?
Kaiyu Li
Xiangyong Cao
Yupeng Deng
Chao Pang
Zepeng Xin
Deyu Meng
Zhi Wang
ObjD
69
1
0
22 Jan 2025
Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation
Chanyoung Kim
Dayun Ju
Woojung Han
Ming-Hsuan Yang
Seong Jae Hwang
VLM
VOS
79
0
0
26 Nov 2024
Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
Sule Bai
Yong-Jin Liu
Yifei Han
Haoji Zhang
Yansong Tang
VLM
79
3
0
24 Nov 2024
ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements
M. Arda Aydın
Efe Mert Çırpar
Elvin Abdinli
Gözde B. Ünal
Y. Sahin
VLM
71
0
0
18 Nov 2024
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
Yuhan Zhu
Yuyang Ji
Zhiyu Zhao
Gangshan Wu
Limin Wang
VLM
44
7
0
05 Jul 2024
Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation
Yunheng Li
Zhongyu Li
Quansheng Zeng
Qibin Hou
Ming-Ming Cheng
VLM
48
8
0
02 Jun 2024
CLIP-S
4
^4
4
: Language-Guided Self-Supervised Semantic Segmentation
Wenbin He
Suphanut Jamonnak
Liangke Gou
Liu Ren
VLM
43
31
0
01 May 2023
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
Jiarui Xu
Sifei Liu
Arash Vahdat
Wonmin Byeon
Xiaolong Wang
Shalini De Mello
VLM
223
320
0
08 Mar 2023
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
Xinyu Wang
ViT
VLM
192
499
0
22 Feb 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
335
5,785
0
29 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
313
3,708
0
11 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Joey Tianyi Zhou
MLLM
262
525
0
04 Feb 2021
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
253
1,828
0
18 Aug 2016
1