Scaling Open-Vocabulary Image Segmentation with Image-Level Labels

22 December 2021
Golnaz Ghiasi
Xiuye Gu
Huayu Chen
Nayeon Lee
    VLM

Papers citing "Scaling Open-Vocabulary Image Segmentation with Image-Level Labels"

50 / 298 papers shown
FreeQ-Graph: Free-form Querying with Semantic Consistent Scene Graph for 3D Scene Understanding
Chenlu Zhan
Gaoang Wang
Hongwei Wang
3DV
24
0
0
16 Jun 2025
Unleashing Diffusion and State Space Models for Medical Image Segmentation
Rong Wu
Ziqi Chen
Liming Zhong
Heng Li
Hai Shu
MedIm
38
0
0
15 Jun 2025
OV-MAP : Open-Vocabulary Zero-Shot 3D Instance Segmentation Map for Robots
Juno Kim
Yesol Park
Hye Jung Yoon
Byoung-Tak Zhang
76
0
0
13 Jun 2025
Vision Generalist Model: A Survey
Ziyi Wang
Yongming Rao
Shuofeng Sun
Xinrun Liu
Yi Wei
...
Zuyan Liu
Yanbo Wang
Hongmin Liu
Jie Zhou
Jiwen Lu
68
0
0
11 Jun 2025
Leveraging Depth and Language for Open-Vocabulary Domain-Generalized Semantic Segmentation
Siyu Chen
Ting Han
Chengzheng Fu
Changshe Zhang
Chaolei Wang
Jinhe Su
Guorong Cai
Meiliu Wu
ObjD VLM
95
0
0
11 Jun 2025
AetherVision-Bench: An Open-Vocabulary RGB-Infrared Benchmark for Multi-Angle Segmentation across Aerial and Ground Perspectives
Aniruddh Sikdar
Aditya Gandhamal
Suresh Sundaram
VLM
62
0
0
04 Jun 2025
OV-COAST: Cost Aggregation with Optimal Transport for Open-Vocabulary Semantic Segmentation
Aditya Gandhamal
Aniruddh Sikdar
Suresh Sundaram
OT
93
0
0
04 Jun 2025
LEG-SLAM: Real-Time Language-Enhanced Gaussian Splatting for SLAM
Roman Titkov
Egor Zubkov
Dmitry A. Yudin
Jaafar Mahmoud
Malik Mohrat
Gennady Sidorov
3DGS
61
0
0
03 Jun 2025
The Missing Point in Vision Transformers for Universal Image Segmentation
Sajjad Shahabodini
Mobina Mansoori
Farnoush Bayatmakou
J. Abouei
Konstantinos N. Plataniotis
Arash Mohammadi
ViT ISeg
33
0
0
26 May 2025
From Data to Modeling: Fully Open-vocabulary Scene Graph Generation
Zuyao Chen
Jinlin Wu
Zhen Lei
Chang Wen Chen
49
0
0
26 May 2025
DPSeg: Dual-Prompt Cost Volume Learning for Open-Vocabulary Semantic Segmentation
Ziyu Zhao
Xiaoguang Li
Linjia Shi
Nasrin Imanpour
Song Wang
VLM
75
0
0
16 May 2025
Causal Prompt Calibration Guided Segment Anything Model for Open-Vocabulary Multi-Entity Segmentation
Wenwen Qiang
Jianqi Zhang
Jingyao Wang
Changwen Zheng
VLM
141
0
0
10 May 2025
Visual Affordances: Enabling Robots to Understand Object Functionality
Tommaso Apicella
Alessio Xompero
Andrea Cavallaro
130
0
0
08 May 2025
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Yulin Chen
Zhuotao Tian
VLM
102
0
0
07 May 2025
Show or Tell? A Benchmark To Evaluate Visual and Textual Prompts in Semantic Segmentation
Gabriele Rosi
Fabio Cermelli
VLM
173
0
0
06 May 2025
Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models
Yankai Jiang
Peng Zhang
Ke Wang
Yuan Tian
Hai Lin
Xinyu Wang
MedIm
452
0
0
05 May 2025
Cues3D: Unleashing the Power of Sole NeRF for Consistent and Unique Instances in Open-Vocabulary 3D Panoptic Segmentation
Feng Xue
Wenzhuang Xu
Guofeng Zhong
Anlong Ming
N. Sebe
134
0
0
01 May 2025
Multimodal Perception for Goal-oriented Navigation: A Survey
I-Tak Ieong
Hao Tang
LM&Ro LRM
102
0
0
22 Apr 2025
NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation
Junyuan Fang
Zihan Wang
Yanzhe Zhang
Shuzhe Wang
Iaroslav Melekhov
Arno Solin
VLM
88
0
0
20 Apr 2025
EmoSEM: Segment and Explain Emotion Stimuli in Visual Art
Jing Zhang
Dan Guo
Zhangbin Li
Meng Wang
87
0
0
20 Apr 2025
HAECcity: Open-Vocabulary Scene Understanding of City-Scale Point Clouds with Superpoint Graph Clustering
Alexander Rusnak
Frédéric Kaplan
3DPC
79
0
0
18 Apr 2025
FLOSS: Free Lunch in Open-vocabulary Semantic Segmentation
Yasser Benigmim
Mohammad Fahes
Tuan-Hung Vu
Andrei Bursuc
Raoul de Charette
VLM
146
0
0
14 Apr 2025
FindAnything: Open-Vocabulary and Object-Centric Mapping for Robot Exploration in Any Environment
Sebastián Barbas Laina
Simon Boche
Sotiris Papatheodorou
Simon Schaefer
Jaehyung Jung
Stefan Leutenegger
117
0
0
11 Apr 2025
DSM: Building A Diverse Semantic Map for 3D Visual Grounding
Qinghongbing Xie
Zijian Liang
Long Zeng
99
0
0
11 Apr 2025
SemiDAViL: Semi-supervised Domain Adaptation with Vision-Language Guidance for Semantic Segmentation
Hritam Basak
Zhaozheng Yin
VLM
78
0
0
08 Apr 2025
econSG: Efficient and Multi-view Consistent Open-Vocabulary 3D Semantic Gaussians
Can Zhang
G. Lee
3DV
114
0
0
08 Apr 2025
Zero-Shot 4D Lidar Panoptic Segmentation
Yushan Zhang
Aljosa Osep
Laura Leal-Taixé
Tim Meinhardt
3DPC
98
1
0
01 Apr 2025
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis
J. Huang
Baoxiong Jia
Yansen Wang
Ziyu Zhu
Xiongkun Linghu
Qing Li
Song-Chun Zhu
Siyuan Huang
175
5
0
28 Mar 2025
OpenLex3D: A New Evaluation Benchmark for Open-Vocabulary 3D Scene Representations
Christina Kassab
Sacha Morin
Martin Buchner
Matías Mattamala
Kumaraditya Gupta
Abhinav Valada
Liam Paull
Maurice F. Fallon
3DV ELM
72
0
0
25 Mar 2025
LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation
Vladan Stojnić
Yannis Kalantidis
Jirí Matas
Giorgos Tolias
VLM
124
0
0
25 Mar 2025
MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation
Jiaxin Huang
Runnan Chen
Ziwen Li
Zhengqing Gao
Xiao He
Yandong Guo
Mingming Gong
Tongliang Liu
LRM
106
1
0
23 Mar 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li
Cristiano Saltori
Fabio Poiesi
N. Sebe
494
2
0
20 Mar 2025
Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model
Zhaochong An
Guolei Sun
Yun Liu
Runjia Li
Junlin Han
Ender Konukoglu
Serge Belongie
VLM
171
3
0
20 Mar 2025
EgoSplat: Open-Vocabulary Egocentric Scene Understanding with Language Embedded 3D Gaussian Splatting
Di Li
Jie Feng
Jiahao Chen
Weisheng Dong
Guanbin Li
G. Shi
Licheng Jiao
3DGS VLM
433
0
0
14 Mar 2025
SAS: Segment Any 3D Scene with Integrated 2D Priors
Zechao Li
Jiahao Lu
Jiacheng Deng
Hanzhi Chang
Lifan Wu
Yanzhe Liang
Tianzhu Zhang
112
0
0
11 Mar 2025
YOLOE: Real-Time Seeing Anything
Ao Wang
Lihao Liu
Hui Chen
Zijia Lin
Jiawei Han
Guiguang Ding
VLM ObjD
134
6
0
10 Mar 2025
Towards Universal Text-driven CT Image Segmentation
Yuheng Li
Yuxiang Lai
Maria Thor
Deborah Marshall
Zachary Buchwald
D. Yu
Xiaofeng Yang
MedIm VLM
117
3
0
08 Mar 2025
VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion
Meng Wang
Huilong Pi
Ruihui Li
Yunchuan Qin
Zhuo Tang
KenLi Li
92
2
0
08 Mar 2025
Vision-based 3D Semantic Scene Completion via Capture Dynamic Representations
Meng Wang
Fan Wu
Yunchuan Qin
Ruihui Li
Zhuo Tang
KenLi Li
3DPC
146
0
0
08 Mar 2025
Open-Vocabulary Semantic Part Segmentation of 3D Human
Keito Suzuki
Bang Du
Girish Krishnan
Kunyao Chen
Runfa Li
Truong Thao Nguyen
3DH VLM
152
0
0
27 Feb 2025
Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields
Xingyu Miao
Haoran Duan
Yang Bai
Tejal Shah
Jun Song
Yang Long
R. Ranjan
Ling Shao
161
5
0
31 Jan 2025
3rd Workshop on Maritime Computer Vision (MaCVi) 2025: Challenge Results
Benjamin Kiefer
Lojze Žust
Jon Muhovič
Matej Kristan
J. Pers
...
Ashraf Saleem
Ching-Heng Cheng
Yu-Fan Lin
Tzu-Yu Lin
Chih-Chung Hsu
77
1
0
20 Jan 2025
DreamMask: Boosting Open-vocabulary Panoptic Segmentation with Synthetic Data
Yuanpeng Tu
Xi Chen
Ser-Nam Lim
Hengshuang Zhao
190
1
0
03 Jan 2025
User Willingness-aware Sales Talk Dataset
Asahi Hentona
Jun Baba
Shiki Sato
Reina Akama
111
6
0
27 Dec 2024
LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding
Hao Li
Roy Qin
Zhengyu Zou
Diqi He
Yangqiu Song
Bingquan Dai
Dingwen Zhang
Jiawei Han
3DGS
122
2
0
23 Dec 2024
DINOv2 Meets Text: A Unified Framework for Image- and Pixel-Level Vision-Language Alignment
Cijo Jose
Théo Moutakanni
Dahyun Kang
Federico Baldassarre
Timothée Darcet
...
Maxime Oquab
Oriane Siméoni
Huy V. Vo
Patrick Labatut
Piotr Bojanowski
CLIP VLM
178
8
0
20 Dec 2024
Incorporating Feature Pyramid Tokenization and Open Vocabulary Semantic Segmentation
J. Zhang
Li Zhang
Shijian Li
VLM
177
0
0
18 Dec 2024
RelationField: Relate Anything in Radiance Fields
Sebastian Koch
Johanna Wald
Mirco Colosi
Narunas Vaskevicius
Pedro Hermosilla
F. Tombari
Timo Ropinski
178
1
0
18 Dec 2024
Open-World Panoptic Segmentation
Matteo Sodano
Federico Magistri
Jens Behley
Cyrill Stachniss
VLM
159
0
0
17 Dec 2024
Towards Open-Vocabulary Video Semantic Segmentation
Xuelong Li
Yun Liu
Guolei Sun
Min Wu
Le Zhang
Ce Zhu
VLM VOS
136
2
0
12 Dec 2024