2210.04150
Cited By
v1
v2
v3 (latest)
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
9 October 2022
Feng Liang
Bichen Wu
Xiaoliang Dai
Kunpeng Li
Yinan Zhao
Hang Zhang
Peizhao Zhang
Peter Vajda
Diana Marculescu
CLIP
VLM
Papers citing
"Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP"
50 / 66 papers shown
Title
What You Perceive Is What You Conceive: A Cognition-Inspired Framework for Open Vocabulary Image Segmentation
Jianghang Lin
Yue Hu
Jiangtao Shen
Yunhang Shen
Liujuan Cao
Shengchuan Zhang
Chia-Wen Lin
ObjD
VLM
195
0
0
26 May 2025
Cues3D: Unleashing the Power of Sole NeRF for Consistent and Unique Instances in Open-Vocabulary 3D Panoptic Segmentation
Feng Xue
Wenzhuang Xu
Guofeng Zhong
Anlong Ming
N. Sebe
123
0
0
01 May 2025
LGD: Leveraging Generative Descriptions for Zero-Shot Referring Image Segmentation
Jiachen Li
Qing Xie
Xiaohan Yu
Hongyun Wang
Jinyu Xu
Yongjian Liu
ObjD
141
0
0
20 Apr 2025
Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts
Shiu-hong Kao
Yu-Wing Tai
Chi-Keung Tang
LRM
MLLM
237
1
0
10 Mar 2025
ZeroPS: High-quality Cross-modal Knowledge Transfer for Zero-Shot 3D Part Segmentation
Yuheng Xue
Nenglun Chen
Jun Liu
Wenyun Sun
3DPC
222
7
0
24 Feb 2025
Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields
Xingyu Miao
Haoran Duan
Yang Bai
Tejal Shah
Jun Song
Yang Long
R. Ranjan
Ling Shao
154
5
0
31 Jan 2025
Parameter-Efficient Fine-Tuning for Foundation Models
Dan Zhang
Tao Feng
Lilong Xue
Yuandong Wang
Yuxiao Dong
J. Tang
202
12
0
23 Jan 2025
DreamMask: Boosting Open-vocabulary Panoptic Segmentation with Synthetic Data
Yuanpeng Tu
Xi Chen
Ser-Nam Lim
Hengshuang Zhao
164
1
0
03 Jan 2025
ROSE: Revolutionizing Open-Set Dense Segmentation with Patch-Wise Perceptual Large Multimodal Model
Kunyang Han
Yibo Hu
Mengxue Qu
Hailin Shi
Yao Zhao
Y. X. Wei
MLLM
VLM
3DV
229
1
0
29 Nov 2024
CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections
Mohamed Fazli Mohamed Imam
Rufael Fedaku Marew
Jameel Hassan
Mustansar Fiaz
Alham Fikri Aji
Hisham Cholakkal
VLM
511
1
0
28 Nov 2024
Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation
Chanyoung Kim
Dayun Ju
Woojung Han
Ming-Hsuan Yang
Seong Jae Hwang
VLM
VOS
257
1
0
26 Nov 2024
Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
Sule Bai
Yong-Jin Liu
Yifei Han
Haoji Zhang
Yansong Tang
VLM
293
7
0
24 Nov 2024
ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements
M. Arda Aydın
Efe Mert Çırpar
Elvin Abdinli
Gözde B. Ünal
Y. Sahin
VLM
265
1
0
18 Nov 2024
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
Shehan Munasinghe
Hanan Gani
Wenqi Zhu
Jiale Cao
Eric P. Xing
Fahad Shahbaz Khan
Salman Khan
MLLM
VGen
VLM
94
8
0
07 Nov 2024
Multiple Information Prompt Learning for Cloth-Changing Person Re-Identification
Shengxun Wei
Zan Gao
Yibo Zhao
Weili Guan
Shengyong Chen
125
2
0
01 Nov 2024
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding
Guangda Ji
Silvan Weder
Francis Engelmann
Marc Pollefeys
Hermann Blum
3DV
127
4
0
17 Oct 2024
SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture
Jiayi Han
Liang Du
Hongwei Du
Xiangguo Zhou
Yiwen Wu
Weibo Zheng
Donghong Han
CLL
MoMe
MoE
67
4
0
10 Oct 2024
Understanding and Mitigating Miscalibration in Prompt Tuning for Vision-Language Models
Shuoyuan Wang
Yixuan Li
Hongxin Wei
VLM
120
2
0
03 Oct 2024
Search3D: Hierarchical Open-Vocabulary 3D Segmentation
Ayca Takmaz
Alexandros Delitzas
R. Sumner
Francis Engelmann
Johanna Wald
Federico Tombari
145
13
0
27 Sep 2024
Vocabulary-Free 3D Instance Segmentation with Vision and Language Assistant
Guofeng Mei
Luigi Riz
Yiming Wang
Fabio Poiesi
ISeg
VLM
115
4
0
20 Aug 2024
Visual Agents as Fast and Slow Thinkers
Guangyan Sun
Mingyu Jin
Zhenting Wang
Cheng-Long Wang
Siqi Ma
Qifan Wang
Ying Nian Wu
Dongfang Liu
LLMAG
LRM
184
18
0
16 Aug 2024
F-LMM: Grounding Frozen Large Multimodal Models
Size Wu
Sheng Jin
Wenwei Zhang
Lumin Xu
Wentao Liu
Wei Li
Chen Change Loy
MLLM
140
15
0
09 Jun 2024
Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation
Mohamed El Amine Boudjoghra
Angela Dai
Jean Lahoud
Hisham Cholakkal
Rao Muhammad Anwer
Salman Khan
Fahad Shahbaz Khan
VLM
ISeg
160
6
0
04 Jun 2024
Proxy Denoising for Source-Free Domain Adaptation
Song Tang
Wenxin Su
Mao Ye
Jianwei Zhang
Xiatian Zhu
137
2
0
03 Jun 2024
O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation
Muer Tie
Julong Wei
Zhengjun Wang
Ke Wu
Shansuai Yuan
Kaizhao Zhang
Jie Jia
Jieru Zhao
Zhongxue Gan
Wenchao Ding
104
6
0
10 Apr 2024
Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model
Huan Ma
Yan Zhu
Changqing Zhang
Peilin Zhao
Baoyuan Wu
Long-Kai Huang
Qinghua Hu
Bing Wu
VLM
111
2
0
01 Mar 2024
Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation
Ci-Siang Lin
Chien-Yi Wang
Yu-Chiang Frank Wang
Min-Hung Chen
VLM
223
0
0
22 Jan 2024
Auto-Vocabulary Semantic Segmentation
Osman Ülger
Maksymilian Kulicki
Yuki M. Asano
Martin R. Oswald
VLM
125
2
0
07 Dec 2023
End-to-End Breast Cancer Radiotherapy Planning via LMMs with Consistency Embedding
Kwanyoung Kim
Y. Oh
S. Park
H. Byun
Joongyo Lee
Jin Sung Kim
Yong Bae Kim
Jong Chul Ye
97
0
0
27 Nov 2023
Side Adapter Network for Open-Vocabulary Semantic Segmentation
Mengde Xu
Zheng Zhang
Fangyun Wei
Han Hu
Xiang Bai
VLM
75
265
0
23 Feb 2023
ZegOT: Zero-shot Segmentation Through Optimal Transport of Text Prompts
Kwanyoung Kim
Y. Oh
Jong Chul Ye
VLM
OT
CLIP
64
20
0
28 Jan 2023
SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation
Huaishao Luo
Junwei Bao
Youzheng Wu
Xiaodong He
Tianrui Li
VLM
101
153
0
27 Nov 2022
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
Wenhui Wang
Hangbo Bao
Li Dong
Johan Bjorck
Zhiliang Peng
...
Kriti Aggarwal
O. Mohammed
Saksham Singhal
Subhojit Som
Furu Wei
MLLM
VLM
ViT
148
644
0
22 Aug 2022
Vision Transformer Adapter for Dense Predictions
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
136
567
0
17 May 2022
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
Katherine Crowson
Stella Biderman
Daniel Kornis
Dashiell Stander
Eric Hallahan
Louis Castricato
Edward Raff
CLIP
133
380
0
18 Apr 2022
Exploring Visual Prompts for Adapting Large-Scale Models
Hyojin Bahng
Ali Jahanian
S. Sankaranarayanan
Phillip Isola
VLM
VPVLM
LRM
68
272
0
31 Mar 2022
Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model
Yu Du
Fangyun Wei
Zihe Zhang
Miaojing Shi
Yue Gao
Guoqi Li
VPVLM
VLM
81
334
0
28 Mar 2022
Visual Prompt Tuning
Menglin Jia
Luming Tang
Bor-Chun Chen
Claire Cardie
Serge Belongie
Bharath Hariharan
Ser-Nam Lim
VLM
VPVLM
155
1,641
0
23 Mar 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
Xinyu Wang
ViT
VLM
289
527
0
22 Feb 2022
Language-driven Semantic Segmentation
Boyi Li
Kilian Q. Weinberger
Serge Belongie
V. Koltun
René Ranftl
VLM
124
625
0
10 Jan 2022
Data Efficient Language-supervised Zero-shot Recognition with Optimal Transport Distillation
Bichen Wu
Rui Cheng
Peizhao Zhang
Tianren Gao
Peter Vajda
Joseph E. Gonzalez
VLM
84
45
0
17 Dec 2021
RegionCLIP: Region-based Language-Image Pretraining
Yiwu Zhong
Jianwei Yang
Pengchuan Zhang
Chunyuan Li
Noel Codella
...
Luowei Zhou
Xiyang Dai
Lu Yuan
Yin Li
Jianfeng Gao
VLM
CLIP
151
580
0
16 Dec 2021
Decoupling Zero-Shot Semantic Segmentation
Jian Ding
Nan Xue
Guisong Xia
Dengxin Dai
VLM
106
195
0
15 Dec 2021
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
Yangguang Li
Feng Liang
Lichen Zhao
Yufeng Cui
Wanli Ouyang
Jing Shao
F. Yu
Junjie Yan
VLM
CLIP
152
458
0
11 Oct 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
505
2,409
0
02 Sep 2021
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Pengfei Liu
Weizhe Yuan
Jinlan Fu
Zhengbao Jiang
Hiroaki Hayashi
Graham Neubig
VLM
SyDa
228
3,989
0
28 Jul 2021
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Bowen Cheng
Alex Schwing
Alexander Kirillov
VLM
ViT
210
1,551
0
13 Jul 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Nayeon Lee
Weicheng Kuo
Huayu Chen
VLM
ObjD
293
920
0
28 Apr 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
587
4,084
0
18 Apr 2021
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
Or Patashnik
Zongze Wu
Eli Shechtman
Daniel Cohen-Or
Dani Lischinski
CLIP
VLM
129
1,209
0
31 Mar 2021