2312.04539
Cited By
v1
v2
v3 (latest)
Auto-Vocabulary Semantic Segmentation
7 December 2023
Osman Ülger
Maksymilian Kulicki
Yuki M. Asano
Martin R. Oswald
VLM
Papers citing
"Auto-Vocabulary Semantic Segmentation"
37 / 37 papers shown
Title
TAG: Guidance-free Open-Vocabulary Semantic Segmentation
Yasufumi Kawano
Yoshimitsu Aoki
VLM
46
4
0
17 Mar 2024
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VLM
CLIP
74
148
0
04 Aug 2023
LISA: Reasoning Segmentation via Large Language Model
Xin Lai
Zhuotao Tian
Yukang Chen
Yanwei Li
Yuhui Yuan
Shu Liu
Jiaya Jia
LM&Ro
VLM
MLLM
LRM
128
457
0
01 Aug 2023
Going Denser with Open-Vocabulary Part Segmentation
Pei Sun
Shoufa Chen
Chenchen Zhu
Fanyi Xiao
Ping Luo
Saining Xie
Zhicheng Yan
ObjD
VLM
63
48
0
18 May 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
560
4,861
0
17 Apr 2023
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition
Shuhuai Ren
Aston Zhang
Yi Zhu
Shuai Zhang
Shuai Zheng
Mu Li
Alexander J. Smola
Xu Sun
VPVLM
VLM
62
28
0
10 Apr 2023
Segment Anything
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
...
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
336
7,297
0
05 Apr 2023
Zero-guidance Segmentation Using Zero Segment Labels
Pitchaporn Rewatbowornwong
Nattanat Chatthee
Ekapol Chuangsuwanich
Supasorn Suwajanakorn
VLM
45
12
0
23 Mar 2023
CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation
Seokju Cho
Heeseong Shin
Sung‐Jin Hong
Anurag Arnab
Paul Hongsuck Seo
Seung Wook Kim
VLM
78
112
0
21 Mar 2023
Open-vocabulary Panoptic Segmentation with Embedding Modulation
Xi Chen
Shuang Li
Ser-Nam Lim
Antonio Torralba
Hengshuang Zhao
VLM
64
33
0
20 Mar 2023
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
Jiarui Xu
Sifei Liu
Arash Vahdat
Wonmin Byeon
Xiaolong Wang
Shalini De Mello
VLM
276
334
0
08 Mar 2023
A Language-Guided Benchmark for Weakly Supervised Open Vocabulary Semantic Segmentation
Prashant Pandey
Mustafa Chasmai
Monish Natarajan
Brejesh Lall
VLM
68
5
0
27 Feb 2023
Side Adapter Network for Open-Vocabulary Semantic Segmentation
Mengde Xu
Zheng Zhang
Fangyun Wei
Han Hu
Xiang Bai
VLM
73
264
0
23 Feb 2023
Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision
Jilan Xu
Junlin Hou
Yuejie Zhang
Rui Feng
Yi Wang
Yu Qiao
Weidi Xie
VLM
62
86
0
22 Jan 2023
Generalized Decoding for Pixel, Image, and Language
Xueyan Zou
Zi-Yi Dou
Jianwei Yang
Zhe Gan
Linjie Li
...
Lu Yuan
Nanyun Peng
Lijuan Wang
Yong Jae Lee
Jianfeng Gao
VLM
MLLM
ObjD
93
259
0
21 Dec 2022
Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models
Chaofan Ma
Yu-Hao Yang
Yanfeng Wang
Ya Zhang
Weidi Xie
VLM
64
48
0
27 Oct 2022
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Feng Liang
Bichen Wu
Xiaoliang Dai
Kunpeng Li
Yinan Zhao
Hang Zhang
Peizhao Zhang
Peter Vajda
Diana Marculescu
CLIP
VLM
100
452
0
09 Oct 2022
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong
Jianmin Bao
Yinglin Zheng
Ting Zhang
Dongdong Chen
...
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
CLIP
VLM
83
167
0
25 Aug 2022
Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding
Quan Liu
Youpeng Wen
Jianhua Han
Chunjing Xu
Hang Xu
Xiaodan Liang
VLM
102
69
0
18 Jul 2022
What is Where by Looking: Weakly-Supervised Open-World Phrase-Grounding without Text Inputs
Tal Shaharabany
Yoad Tewel
Lior Wolf
ObjD
60
16
0
19 Jun 2022
Fine-grained Image Captioning with CLIP Reward
Jaemin Cho
Seunghyun Yoon
Ajinkya Kale
Franck Dernoncourt
Trung Bui
Joey Tianyi Zhou
CLIP
208
79
0
26 May 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
Xinyu Wang
ViT
VLM
289
526
0
22 Feb 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
542
4,398
0
28 Jan 2022
Language-driven Semantic Segmentation
Boyi Li
Kilian Q. Weinberger
Serge Belongie
V. Koltun
René Ranftl
VLM
122
625
0
10 Jan 2022
A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model
Mengde Xu
Zheng Zhang
Fangyun Wei
Yutong Lin
Yue Cao
Han Hu
Xiang Bai
VLM
125
224
0
29 Dec 2021
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Golnaz Ghiasi
Xiuye Gu
Huayu Chen
Nayeon Lee
VLM
124
382
0
22 Dec 2021
ClipCap: CLIP Prefix for Image Captioning
Ron Mokady
Amir Hertz
Amit H. Bermano
CLIP
VLM
71
679
0
18 Nov 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
694
6,079
0
29 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
964
29,731
0
26 Feb 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
664
41,103
0
22 Oct 2020
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,226
0
27 Aug 2019
Zero-Shot Semantic Segmentation
Max Bucher
Tuan-Hung Vu
Matthieu Cord
P. Pérez
VLM
SSeg
150
320
0
03 Jun 2019
Detecting the Unexpected via Image Resynthesis
Krzysztof Lis
Krishna Kanth Nakka
Pascal Fua
Mathieu Salzmann
UQCV
52
178
0
16 Apr 2019
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
713
132,199
0
12 Jun 2017
The Cityscapes Dataset for Semantic Urban Scene Understanding
Marius Cordts
Mohamed Omran
Sebastian Ramos
Timo Rehfeld
Markus Enzweiler
Rodrigo Benenson
Uwe Franke
Stefan Roth
Bernt Schiele
1.1K
11,623
0
06 Apr 2016
Grounding of Textual Phrases in Images by Reconstruction
Anna Rohrbach
Marcus Rohrbach
Ronghang Hu
Trevor Darrell
Bernt Schiele
80
497
0
12 Nov 2015
Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials
Philipp Krahenbuhl
V. Koltun
132
3,452
0
20 Oct 2012