ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2112.12143
  4. Cited By
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
v1v2 (latest)

Scaling Open-Vocabulary Image Segmentation with Image-Level Labels

22 December 2021
Golnaz Ghiasi
Xiuye Gu
Huayu Chen
Nayeon Lee
    VLM
ArXiv (abs)PDFHTML

Papers citing "Scaling Open-Vocabulary Image Segmentation with Image-Level Labels"

50 / 298 papers shown
Title
OVIR-3D: Open-Vocabulary 3D Instance Retrieval Without Training on 3D
  Data
OVIR-3D: Open-Vocabulary 3D Instance Retrieval Without Training on 3D Data
Shiyang Lu
Haonan Chang
E. Jing
Abdeslam Boularias
Kostas Bekris
90
58
0
06 Nov 2023
Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic
  Segmentation
Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation
Fei Zhang
Tianfei Zhou
Boyang Li
Hao He
Chaofan Ma
Tianjiao Zhang
Jiangchao Yao
Ya Zhang
Yanfeng Wang
VLM
125
21
0
29 Oct 2023
Drive Anywhere: Generalizable End-to-end Autonomous Driving with
  Multi-modal Foundation Models
Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models
Tsun-Hsuan Wang
Alaa Maalouf
Wei Xiao
Yutong Ban
Alexander Amini
Guy Rosman
S. Karaman
Daniela Rus
73
46
0
26 Oct 2023
Lang3DSG: Language-based contrastive pre-training for 3D Scene Graph
  prediction
Lang3DSG: Language-based contrastive pre-training for 3D Scene Graph prediction
Sebastian Koch
Pedro Hermosilla
Narunas Vaskevicius
Mirco Colosi
Timo Ropinski
96
11
0
25 Oct 2023
Open-NeRF: Towards Open Vocabulary NeRF Decomposition
Open-NeRF: Towards Open Vocabulary NeRF Decomposition
Hao Zhang
Fang Li
Narendra Ahuja
90
12
0
25 Oct 2023
CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought
  Language Prompting
CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting
Lei Li
115
24
0
24 Oct 2023
OV-VG: A Benchmark for Open-Vocabulary Visual Grounding
OV-VG: A Benchmark for Open-Vocabulary Visual Grounding
Chunlei Wang
Wenquan Feng
Xiangtai Li
Guangliang Cheng
Shuchang Lyu
Binghao Liu
Lijiang Chen
Qi Zhao
ObjDVLM
96
11
0
22 Oct 2023
CLIP meets Model Zoo Experts: Pseudo-Supervision for Visual Enhancement
CLIP meets Model Zoo Experts: Pseudo-Supervision for Visual Enhancement
Mohammadreza Salehi
Mehrdad Farajtabar
Maxwell Horton
Fartash Faghri
Hadi Pouransari
Raviteja Vemulapalli
Oncel Tuzel
Ali Farhadi
Mohammad Rastegari
Sachin Mehta
CLIPVLM
81
2
0
21 Oct 2023
SILC: Improving Vision Language Pretraining with Self-Distillation
SILC: Improving Vision Language Pretraining with Self-Distillation
Muhammad Ferjad Naeem
Yongqin Xian
Xiaohua Zhai
Lukas Hoyer
Luc Van Gool
F. Tombari
VLM
110
36
0
20 Oct 2023
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
Jianwei Yang
Hao Zhang
Feng Li
Xueyan Zou
Chun-yue Li
Jianfeng Gao
MLLMVLM
128
189
0
17 Oct 2023
Towards Training-free Open-world Segmentation via Image Prompt
  Foundation Models
Towards Training-free Open-world Segmentation via Image Prompt Foundation Models
Lv Tang
Peng-Tao Jiang
Haoke Xiao
Bo Li
VLM
94
11
0
17 Oct 2023
Building an Open-Vocabulary Video CLIP Model with Better Architectures,
  Optimization and Data
Building an Open-Vocabulary Video CLIP Model with Better Architectures, Optimization and Data
Zuxuan Wu
Zejia Weng
Wujian Peng
Xitong Yang
Ang Li
Larry S. Davis
Yu-Gang Jiang
CLIPVLM
95
22
0
08 Oct 2023
Compositional Semantics for Open Vocabulary Spatio-semantic
  Representations
Compositional Semantics for Open Vocabulary Spatio-semantic Representations
Robin Karlsson
Francisco Lepe-Salazar
K. Takeda
VLM
78
1
0
08 Oct 2023
ALT-Pilot: Autonomous navigation with Language augmented Topometric maps
ALT-Pilot: Autonomous navigation with Language augmented Topometric maps
Mohammad Omama
Pranav Inani
Pranjal Paul
Sarat Chandra Yellapragada
Krishna Murthy Jatavallabhula
Sandeep Chinchali
Madhava Krishna
64
16
0
03 Oct 2023
CLIP Is Also a Good Teacher: A New Learning Framework for Inductive
  Zero-shot Semantic Segmentation
CLIP Is Also a Good Teacher: A New Learning Framework for Inductive Zero-shot Semantic Segmentation
Jialei Chen
Daisuke Deguchi
Chenkai Zhang
Xu Zheng
Hiroshi Murase
VLM
124
9
0
03 Oct 2023
DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object
  Detection
DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection
Shilin Xu
Xiangtai Li
Size Wu
Wenwei Zhang
Yunhai Tong
Chen Change Loy
ObjDVLM
69
0
0
02 Oct 2023
Learning Mask-aware CLIP Representations for Zero-Shot Segmentation
Learning Mask-aware CLIP Representations for Zero-Shot Segmentation
Siyu Jiao
Yunchao Wei
Yaowei Wang
Yao-Min Zhao
Humphrey Shi
VLM
108
50
0
30 Sep 2023
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and
  Planning
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
Yuanyi Zhong
Alihusein Kuwajerwala
Sacha Morin
Krishna Murthy Jatavallabhula
Bipasha Sen
...
Celso Miguel de Melo
Joshua B. Tenenbaum
Antonio Torralba
Florian Shkurti
Liam Paull
LM&Ro
128
189
0
28 Sep 2023
Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs
Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs
Haonan Chang
Kowndinya Boyalakuntla
Shiyang Lu
Siwei Cai
E. Jing
...
Shijie Geng
Adeeb Abbas
Lifeng Zhou
Kostas Bekris
Abdeslam Boularias
88
28
0
27 Sep 2023
Object-Centric Open-Vocabulary Image-Retrieval with Aggregated Features
Object-Centric Open-Vocabulary Image-Retrieval with Aggregated Features
Hila Levi
Guy Heller
Dan Levi
Ethan Fetaya
OCLVLM
74
4
0
26 Sep 2023
Unsupervised 3D Perception with 2D Vision-Language Distillation for
  Autonomous Driving
Unsupervised 3D Perception with 2D Vision-Language Distillation for Autonomous Driving
Mahyar Najibi
Jingwei Ji
Yin Zhou
C. Qi
Xinchen Yan
Scott Ettinger
Drago Anguelov
75
29
0
25 Sep 2023
CLIP-DIY: CLIP Dense Inference Yields Open-Vocabulary Semantic
  Segmentation For-Free
CLIP-DIY: CLIP Dense Inference Yields Open-Vocabulary Semantic Segmentation For-Free
Monika Wysoczañska
Michael Ramamonjisoa
Tomasz Trzciñski
Oriane Siméoni
3DVVLM
124
22
0
25 Sep 2023
Rewrite Caption Semantics: Bridging Semantic Gaps for
  Language-Supervised Semantic Segmentation
Rewrite Caption Semantics: Bridging Semantic Gaps for Language-Supervised Semantic Segmentation
Yun Xing
Jian Kang
Aoran Xiao
Jiahao Nie
Ling Shao
Shijian Lu
VLM
89
13
0
24 Sep 2023
MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary
  Instance Segmentation
MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation
Jiahao Xie
Wei Li
Xiangtai Li
Ziwei Liu
Yew-Soon Ong
Chen Change Loy
DiffMVLM
154
37
0
22 Sep 2023
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language
  Model as an Agent
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent
Jianing Yang
Xuweiyi Chen
Shengyi Qian
Nikhil Madaan
Madhavan Iyengar
David Fouhey
Joyce Chai
LM&RoLLMAG
147
101
0
21 Sep 2023
Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping
Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping
Adam Rashid
Satvik Sharma
Chung Min Kim
Justin Kerr
Lawrence Yunliang Chen
Angjoo Kanazawa
Ken Goldberg
154
94
0
14 Sep 2023
Panoptic Vision-Language Feature Fields
Panoptic Vision-Language Feature Fields
Haoran Chen
Kenneth Blomqvist
Francesco Milano
Roland Siegwart
VLM
84
14
0
11 Sep 2023
From Text to Mask: Localizing Entities Using the Attention of
  Text-to-Image Diffusion Models
From Text to Mask: Localizing Entities Using the Attention of Text-to-Image Diffusion Models
Changming Xiao
Qi Yang
Feng Zhou
Changshui Zhang
84
17
0
08 Sep 2023
Diffusion Model is Secretly a Training-free Open Vocabulary Semantic
  Segmenter
Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter
Jinglong Wang
Xiawei Li
Jing Zhang
Qingyuan Xu
Qin Zhou
Qian Yu
Lu Sheng
Dong Xu
VLMDiffM
113
48
0
06 Sep 2023
Towards Universal Image Embeddings: A Large-Scale Dataset and Challenge
  for Generic Image Representations
Towards Universal Image Embeddings: A Large-Scale Dataset and Challenge for Generic Image Representations
Nikolaos-Antonios Ypsilantis
Kaifeng Chen
Bingyi Cao
Mário Lipovský
Pelin Dogan-Schönberger
Grzegorz Makosa
Boris Bluntschli
Mojtaba Seyedhosseini
Ondrej Chum
André Araujo
SSL
100
14
0
04 Sep 2023
Contrastive Feature Masking Open-Vocabulary Vision Transformer
Contrastive Feature Masking Open-Vocabulary Vision Transformer
Dahun Kim
A. Angelova
Weicheng Kuo
ObjDVLM
120
27
0
02 Sep 2023
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
Zhening Huang
Xiaoyang Wu
Xi Chen
Hengshuang Zhao
Lei Zhu
Joan Lasenby
ISeg3DPCVLM
138
51
0
01 Sep 2023
AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute
  Decomposition-Aggregation
AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation
Chaofan Ma
Yu-Hao Yang
Chen Ju
Fei Zhang
Ya Zhang
Yanfeng Wang
VLM
127
19
0
31 Aug 2023
CL-MAE: Curriculum-Learned Masked Autoencoders
CL-MAE: Curriculum-Learned Masked Autoencoders
Neelu Madan
Nicolae-Cătălin Ristea
Kamal Nasrollahi
T. Moeslund
Radu Tudor Ionescu
93
12
0
31 Aug 2023
Introducing Language Guidance in Prompt-based Continual Learning
Introducing Language Guidance in Prompt-based Continual Learning
Muhammad Gul Zain Ali Khan
Muhammad Ferjad Naeem
Luc Van Gool
D. Stricker
F. Tombari
Muhammad Zeshan Afzal
VLMCLL
105
51
0
30 Aug 2023
Shatter and Gather: Learning Referring Image Segmentation with Text
  Supervision
Shatter and Gather: Learning Referring Image Segmentation with Text Supervision
Dongwon Kim
Nam-Won Kim
Cuiling Lan
Suha Kwak
VLM
98
20
0
29 Aug 2023
UnLoc: A Unified Framework for Video Localization Tasks
UnLoc: A Unified Framework for Video Localization Tasks
Shengjia Yan
Xuehan Xiong
Arsha Nagrani
Anurag Arnab
Zhonghao Wang
Weina Ge
David A. Ross
Cordelia Schmid
138
55
0
21 Aug 2023
Open-vocabulary Video Question Answering: A New Benchmark for Evaluating
  the Generalizability of Video Question Answering Models
Open-vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models
Dohwan Ko
Ji Soo Lee
M. Choi
Jaewon Chu
Jihwan Park
Hyunwoo J. Kim
55
6
0
18 Aug 2023
SegPrompt: Boosting Open-world Segmentation via Category-level Prompt
  Learning
SegPrompt: Boosting Open-world Segmentation via Category-level Prompt Learning
Muzhi Zhu
Hengtao Li
Hao Chen
Chengxiang Fan
Wei Mao
Chenchen Jing
Yifan Liu
Chunhua Shen
VLM
70
17
0
12 Aug 2023
Follow Anything: Open-set detection, tracking, and following in
  real-time
Follow Anything: Open-set detection, tracking, and following in real-time
Alaa Maalouf
Ninad Jadhav
Krishna Murthy Jatavallabhula
Makram Chahine
Daniel M.Vogt
Robert J. Wood
Antonio Torralba
Daniela Rus
105
25
0
10 Aug 2023
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen
  Convolutional CLIP
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VLMCLIP
100
152
0
04 Aug 2023
Lowis3D: Language-Driven Open-World Instance-Level 3D Scene
  Understanding
Lowis3D: Language-Driven Open-World Instance-Level 3D Scene Understanding
Runyu Ding
Jihan Yang
Chuhui Xue
Wenqing Zhang
Song Bai
Xiaojuan Qi
3DVVLM
84
29
0
01 Aug 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming-Hsuan Yang
Fahad Shahbaz Khan
VLM
146
127
0
25 Jul 2023
Described Object Detection: Liberating Object Detection with Flexible
  Expressions
Described Object Detection: Liberating Object Detection with Flexible Expressions
Chi Xie
Zhao Zhang
YiXuan Wu
Feng Zhu
Rui Zhao
Shuang Liang
ObjD
89
35
0
24 Jul 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present,
  and Future
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjDVLM
144
40
0
18 Jul 2023
Unified Open-Vocabulary Dense Visual Prediction
Unified Open-Vocabulary Dense Visual Prediction
Hengcan Shi
Munawar Hayat
Jianfei Cai
ObjDVLM
78
25
0
17 Jul 2023
TIAM -- A Metric for Evaluating Alignment in Text-to-Image Generation
TIAM -- A Metric for Evaluating Alignment in Text-to-Image Generation
P. Grimal
Hervé Le Borgne
Olivier Ferret
Julien Tourille
EGVM
133
12
0
11 Jul 2023
Hierarchical Open-vocabulary Universal Image Segmentation
Hierarchical Open-vocabulary Universal Image Segmentation
Xudong Wang
Shufang Li
Konstantinos Kallidromitis
Yu Kato
Kazuki Kozuka
Trevor Darrell
VLMOCL
126
41
0
03 Jul 2023
Towards Open Vocabulary Learning: A Survey
Towards Open Vocabulary Learning: A Survey
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Guohao Li
Dacheng Tao
ObjDVLM
154
151
0
28 Jun 2023
What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation
What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation
Benedikt Blumenstiel
Johannes Jakubik
Hilde Kuhne
Michael Vossing
VLM
129
18
0
27 Jun 2023
Previous
123456
Next