ResearchTrend.AI
Multi-Modal Classifiers for Open-Vocabulary Object Detection

8 June 2023
Prannay Kaul
Weidi Xie
Andrew Zisserman
    ObjD
    VLM
    MLLM

Papers citing "Multi-Modal Classifiers for Open-Vocabulary Object Detection"

44 / 44 papers shown
Decoupled Global-Local Alignment for Improving Compositional Understanding
Xiaoxing Hu
Kaicheng Yang
Jianmin Wang
Haoran Xu
Ziyong Feng
Yansen Wang
VLM
183
0
0
23 Apr 2025
Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation
Yongchao Feng
Yajie Liu
Shuai Yang
Wenrui Cai
Jingyang Zhang
...
Jiahui Lv
Ziqiang Liu
Tengyuan Shi
Qingjie Liu
Yansen Wang
MLLM
VLM
65
1
0
13 Apr 2025
An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection
Louis Y. Kim
Michelle Karker
Victoria Valledor
Seiyoung C. Lee
Karl F. Brzoska
Margaret Duff
Anthony Palladino
VLM
ObjD
68
0
0
21 Mar 2025
Real Classification by Description: Extending CLIP's Limits of Part Attributes Recognition
Ethan Baron
Idan Tankel
Peter Tu
Guy Ben-Yosef
VLM
87
0
0
18 Dec 2024
OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection
Zhongyu Xia
Jishuo Li
Zhiwei Lin
Xinhao Wang
Yansen Wang
Ming-Hsuan Yang
VLM
84
2
0
26 Nov 2024
Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability
Yusuke Hosoya
Masanori Suganuma
Takayuki Okatani
ObjD
23
0
0
20 Oct 2024
Open World Object Detection: A Survey
Yiming Li
Yi Wang
Wenqian Wang
Dan Lin
Bingbing Li
Kim-Hui Yap
ObjD
50
0
0
15 Oct 2024
SIA-OVD: Shape-Invariant Adapter for Bridging the Image-Region Gap in Open-Vocabulary Detection
Zishuo Wang
Wenhao Zhou
Jinglin Xu
Yuxin Peng
ObjD
VLM
29
1
0
08 Oct 2024
OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking
Zekun Qian
Ruize Han
Wei Feng
Junhui Hou
Linqi Song
Song Wang
44
1
0
19 Jul 2024
CoAPT: Context Attribute words for Prompt Tuning
Gun Lee
Subin An
Sungyong Baik
Soochahn Lee
VPVLM
VLM
35
1
0
18 Jul 2024
Open Vocabulary Multi-Label Video Classification
Rohit Gupta
Mamshad Nayeem Rizve
Jayakrishnan Unnikrishnan
Ashish Tawari
Son Tran
Mubarak Shah
Benjamin Z. Yao
Trishul Chilimbi
VLM
67
1
0
12 Jul 2024
Open-Vocabulary Temporal Action Localization using Multimodal Guidance
Akshita Gupta
Aditya Arora
Sanath Narayan
Salman Khan
Fahad Shahbaz Khan
Graham W. Taylor
43
3
0
21 Jun 2024
OVMR: Open-Vocabulary Recognition with Multi-Modal References
Zehong Ma
Shiliang Zhang
Longhui Wei
Qi Tian
VLM
46
0
0
07 Jun 2024
Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation
Mohamed El Amine Boudjoghra
Angela Dai
Jean Lahoud
Hisham Cholakkal
Rao Muhammad Anwer
Salman Khan
Fahad Shahbaz Khan
VLM
ISeg
83
6
0
04 Jun 2024
An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition
Haojun Xu
Yanlei Gao
Jie Li
Xinbo Gao
48
2
0
02 Jun 2024
CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
M. Rusanovsky
Or Hirschorn
S. Avidan
37
3
0
01 Jun 2024
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection
Mingxuan Liu
Tyler L. Hayes
Elisa Ricci
G. Csurka
Riccardo Volpi
ObjD
61
1
0
16 May 2024
Open-Vocabulary Object Detection via Neighboring Region Attention Alignment
Sunyuan Qiang
Xianfei Li
Yanyan Liang
Wenlong Liao
Tao He
Pai Peng
ObjD
43
0
0
14 May 2024
Do LLMs Understand Visual Anomalies? Uncovering LLM's Capabilities in Zero-shot Anomaly Detection
Jiaqi Zhu
Shaofeng Cai
Fang Deng
Junran Wu
70
15
0
15 Apr 2024
Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation
Yanhao Zheng
Kai Liu
ObjD
26
1
0
12 Apr 2024
Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection
Ting Lei
Shaofeng Yin
Yang Liu
VLM
49
9
0
09 Apr 2024
Retrieval-Augmented Open-Vocabulary Object Detection
Jooyeon Kim
Eulrang Cho
Sehyung Kim
Hyunwoo J. Kim
VLM
ObjD
51
8
0
08 Apr 2024
Open-Vocabulary Object Detectors: Robustness Challenges under Distribution Shifts
Prakash Chandra Chhipa
Kanjar De
Meenakshi Subhash Chippa
Rajkumar Saini
Marcus Liwicki
ObjD
VLM
41
1
0
01 Apr 2024
Generative Region-Language Pretraining for Open-Ended Object Detection
Chuang Lin
Yi Jiang
Lizhen Qu
Zehuan Yuan
Jianfei Cai
ObjD
VLM
53
13
0
15 Mar 2024
Renovating Names in Open-Vocabulary Segmentation Benchmarks
Haiwen Huang
Songyou Peng
Dan Zhang
Andreas Geiger
VLM
37
3
0
14 Mar 2024
Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery
Xavier Bou
Gabriele Facciolo
R. G. V. Gioi
Jean-Michel Morel
T. Ehret
ObjD
49
2
0
08 Mar 2024
Multi-modal Attribute Prompting for Vision-Language Models
Xin Liu
Jiamin Wu
Wenfei Yang
Xu Zhou
Tianzhu Zhang
VLM
29
10
0
01 Mar 2024
A SOUND APPROACH: Using Large Language Models to generate audio descriptions for egocentric text-audio retrieval
Andreea-Maria Oncescu
João F. Henriques
Andrew Zisserman
Samuel Albanie
A. Sophia Koepke
38
5
0
29 Feb 2024
Democratizing Fine-grained Visual Recognition with Large Language Models
Mingxuan Liu
Subhankar Roy
Wenjing Li
Zhun Zhong
N. Sebe
Elisa Ricci
VLM
44
10
0
24 Jan 2024
MaskClustering: View Consensus based Mask Graph Clustering for Open-Vocabulary 3D Instance Segmentation
Mi Yan
Jiazhao Zhang
Yan Zhu
H. Wang
3DV
ISeg
34
29
0
15 Jan 2024
Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance
P. Nguyen
T.D. Ngo
E. Kalogerakis
Chuang Gan
Anh Tran
Cuong Pham
Khoi Duc Minh Nguyen
ISeg
40
51
0
17 Dec 2023
OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
Tom Tongjia Chen
Hongshan Yu
Zhengeng Yang
Zechuan Li
Wei Sun
Chen Chen
23
8
0
30 Nov 2023
Descriptor and Word Soups: Overcoming the Parameter Efficiency Accuracy Tradeoff for Out-of-Distribution Few-shot Learning
Christopher Liao
Theodoros Tsiligkaridis
Brian Kulis
OODD
53
5
0
21 Nov 2023
OV-VG: A Benchmark for Open-Vocabulary Visual Grounding
Chunlei Wang
Wenquan Feng
Xiangtai Li
Guangliang Cheng
Shuchang Lyu
Binghao Liu
Lijiang Chen
Qi Zhao
ObjD
VLM
31
10
0
22 Oct 2023
DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection
Shilin Xu
Xiangtai Li
Size Wu
Wenwei Zhang
Yunhai Tong
Chen Change Loy
ObjD
VLM
34
0
0
02 Oct 2023
Region-centric Image-Language Pretraining for Open-Vocabulary Detection
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD
VLM
21
3
0
29 Sep 2023
Detect Everything with Few Examples
Xinyu Zhang
Yuting Wang
Abdeslam Boularias
ObjD
VLM
37
13
0
22 Sep 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjD
VLM
36
33
0
18 Jul 2023
Towards Open Vocabulary Learning: A Survey
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Guohao Li
Dacheng Tao
ObjD
VLM
47
137
0
28 Jun 2023
Perception Test: A Diagnostic Benchmark for Multimodal Video Models
Viorica Pătrăucean
Lucas Smaira
Ankush Gupta
Adrià Recasens Continente
L. Markeeva
...
Y. Aytar
Simon Osindero
Dima Damen
Andrew Zisserman
João Carreira
VLM
137
145
0
23 May 2023
What does a platypus look like? Generating customized prompts for zero-shot image classification
Sarah M Pratt
Ian Covert
Rosanne Liu
Ali Farhadi
VLM
133
216
0
07 Sep 2022
ImageNet-21K Pretraining for the Masses
T. Ridnik
Emanuel Ben-Baruch
Asaf Noy
Lihi Zelnik-Manor
SSeg
VLM
CLIP
187
690
0
22 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
343
3,726
0
11 Feb 2021
Frustratingly Simple Few-Shot Object Detection
Xin Wang
Thomas E. Huang
Trevor Darrell
Joseph E. Gonzalez
Feng Yu
ObjD
104
544
0
16 Mar 2020