2111.11430
Cited By
Class-agnostic Object Detection with Multi-modal Transformer
22 November 2021
Muhammad Maaz
H. Rasheed
Salman Khan
F. Khan
Rao Muhammad Anwer
Ming Yang
Papers citing
"Class-agnostic Object Detection with Multi-modal Transformer"
50 / 66 papers shown
Title
Enhancing Target-unspecific Tasks through a Features Matrix
Fangming Cui
Yonggang Zhang
Xuan Wang
Xinmei Tian
Jun Yu
AAML
50
0
0
06 May 2025
ResNetVLLM -- Multi-modal Vision LLM for the Video Understanding Task
Ahmad Khalil
Mahmoud Khalil
A. Ngom
VLM
42
1
0
20 Apr 2025
A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection
Shenghao Fu
Junkai Yan
Q. Yang
Xihan Wei
Xiaohua Xie
Wei-Shi Zheng
ObjD
VLM
48
0
0
13 Mar 2025
Space Rotation with Basis Transformation for Training-free Test-Time Adaptation
Chenhao Ding
Xinyuan Gao
Songlin Dong
Yuhang He
Qiang Wang
Xiang Song
Alex C. Kot
Yihong Gong
TTA
VLM
99
0
0
27 Feb 2025
YOLO-UniOW: Efficient Universal Open-World Object Detection
Lihao Liu
Juexiao Feng
Hui Chen
Ao Wang
Lin Song
J. Han
Guiguang Ding
ObjD
VLM
49
2
0
31 Dec 2024
From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects
Zizhao Li
Zhengkang Xiang
Joseph West
Kourosh Khoshelham
ObjD
VLM
99
1
0
27 Nov 2024
3D Audio-Visual Segmentation
Artem Sokolov
Swapnil Bhosale
Xiatian Zhu
VOS
31
0
0
04 Nov 2024
Open World Object Detection: A Survey
Yiming Li
Yi Wang
Wenqian Wang
Dan Lin
Bingbing Li
Kim-Hui Yap
ObjD
39
0
0
15 Oct 2024
LOBG: Less Overfitting for Better Generalization in Vision-Language Model
Chenhao Ding
Xinyuan Gao
Songlin Dong
Yuhang He
Qiang Wang
Alex C. Kot
Yihong Gong
VLM
37
1
0
14 Oct 2024
O1O: Grouping of Known Classes to Identify Unknown Objects as Odd-One-Out
Mısra Yavuz
Fatma Guney
28
0
0
10 Oct 2024
CatFree3D: Category-agnostic 3D Object Detection with Diffusion
Wenjing Bian
Zirui Wang
Andrea Vedaldi
39
1
0
22 Aug 2024
Multimodal Foundational Models for Unsupervised 3D General Obstacle Detection
Tamás Matuszka
Peter Hajas
Dávid Szeghy
42
0
0
22 Aug 2024
Advancing Prompt Learning through an External Layer
Fangming Cui
Xun Yang
Chao Wu
Liang Xiao
Xinmei Tian
VLM
38
1
0
29 Jul 2024
ActionSwitch: Class-agnostic Detection of Simultaneous Actions in Streaming Videos
Hyolim Kang
Jeongseok Hyun
Joungbin An
Youngjae Yu
Seon Joo Kim
38
0
0
17 Jul 2024
Quantized Prompt for Efficient Generalization of Vision-Language Models
Tianxiang Hao
Xiaohan Ding
Juexiao Feng
Yuhong Yang
Hui Chen
Guiguang Ding
VLM
MQ
32
5
0
15 Jul 2024
XAMI -- A Benchmark Dataset for Artefact Detection in XMM-Newton Optical Images
Elisabeta-Iulia Dima
Pablo Gómez
Sandor Kruk
Peter Kretschmar
Simon Rosen
Călin-Adrian Popa
28
0
0
25 Jun 2024
DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection
Jia Syuen Lim
Zhuoxiao Chen
Mahsa Baktashmotlagh
Zhi Chen
Xin Yu
Zi Huang
Yadan Luo
VLM
ObjD
82
1
0
21 Jun 2024
RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection
Fangyi Chen
Han Zhang
Zhantao Yang
Hao Chen
Kai Hu
Marios Savvides
ObjD
VLM
41
5
0
30 May 2024
Multimodal Object Detection via Probabilistic a priori Information Integration
Hafsa El Hafyani
Bastien Pasdeloup
Camille Yver
Pierre Romenteau
25
0
0
24 May 2024
ChEX: Interactive Localization and Region Description in Chest X-rays
Philip Muller
Georgios Kaissis
Daniel Rueckert
35
5
0
24 Apr 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
51
7
0
28 Mar 2024
Unsupervised Audio-Visual Segmentation with Modality Alignment
Swapnil Bhosale
Haosen Yang
Diptesh Kanojia
Jiangkang Deng
Xiatian Zhu
VOS
43
5
0
21 Mar 2024
As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks?
Anjun Hu
Jindong Gu
Francesco Pinto
Konstantinos Kamnitsas
Philip H. S. Torr
AAML
SILM
37
5
0
19 Mar 2024
Unsupervised Collaborative Metric Learning with Mixed-Scale Groups for General Object Retrieval
Shichao Kan
Yuhai Deng
Yixiong Liang
Lihui Cen
Zhe Qu
Yigang Cen
Zhihai He
40
0
0
16 Mar 2024
Zero-shot Generalizable Incremental Learning for Vision-Language Object Detection
Jieren Deng
Haojian Zhang
Kun Ding
Jianhua Hu
Xingxuan Zhang
Yunkuan Wang
VLM
ObjD
82
4
0
04 Mar 2024
APLe: Token-Wise Adaptive for Multi-Modal Prompt Learning
Guiming Cao
Kaize Shi
Hong Fu
Huaiwen Zhang
Guandong Xu
VLM
31
1
0
12 Jan 2024
YOLO-Former: YOLO Shakes Hand With ViT
J. Khoramdel
A. Moori
Y. Borhani
A. Ghanbarzadeh
Esmaeil Najafi
ViT
24
2
0
11 Jan 2024
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Haobo Yuan
Xiangtai Li
Chong Zhou
Yining Li
Kai Chen
Chen Change Loy
VLM
29
51
0
05 Jan 2024
COMMA: Co-Articulated Multi-Modal Learning
Lianyu Hu
Liqing Gao
Zekang Liu
Chi-Man Pun
Wei Feng
VLM
20
0
0
30 Dec 2023
Understanding the Multi-modal Prompts of the Pre-trained Vision-Language Model
Shuailei Ma
Chen-Wei Xie
Ying-yu Wei
Siyang Sun
Jiaqi Fan
Xiaoyi Bao
Yuxin Guo
Yun Zheng
VLM
VPVLM
26
2
0
18 Dec 2023
MobileSAMv2: Faster Segment Anything to Everything
Chaoning Zhang
Dongshen Han
Sheng Zheng
J. Choi
Tae-Ho Kim
Choong Seon Hong
VLM
27
23
0
15 Dec 2023
ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open-Vocabulary Object Detection
Joonhyun Jeong
Geondo Park
Jayeon Yoo
Hyungsik Jung
Heesu Kim
VLM
ObjD
41
10
0
12 Dec 2023
Open World Object Detection in the Era of Foundation Models
O. Zohar
Alejandro Lozano
Shelly Goel
Serena Yeung
Kuan-Chieh Wang
VLM
31
9
0
10 Dec 2023
VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding
Yizhou Wang
Ruiyi Zhang
Haoliang Wang
Uttaran Bhattacharya
Yun Fu
Gang Wu
MLLM
32
10
0
04 Dec 2023
Proposal-Level Unsupervised Domain Adaptation for Open World Unbiased Detector
Xuanyi Liu
Zhongqi Yue
Xian-Sheng Hua
17
0
0
04 Nov 2023
Towards Open World Active Learning for 3D Object Detection
Zhuoxiao Chen
Yadan Luo
Zixin Wang
Zijian Wang
Xin Yu
Zi Huang
32
0
0
16 Oct 2023
Leveraging Foundation models for Unsupervised Audio-Visual Segmentation
Swapnil Bhosale
Haosen Yang
Diptesh Kanojia
Xiatian Zhu
VOS
47
5
0
13 Sep 2023
Transformers in Small Object Detection: A Benchmark and Survey of State-of-the-Art
Aref Miri Rekavandi
Shima Rashidi
F. Boussaïd
Stephen Hoefs
Emre Akbas
Bennamoun
ViT
46
23
0
10 Sep 2023
Contrastive Feature Masking Open-Vocabulary Vision Transformer
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD
VLM
23
27
0
02 Sep 2023
Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection
Yifan Xu
Mengdan Zhang
Xiaoshan Yang
Changsheng Xu
ObjD
32
5
0
30 Aug 2023
GenKL: An Iterative Framework for Resolving Label Ambiguity and Label Non-conformity in Web Images Via a New Generalized KL Divergence
Xia Huang
Kai Fong Ernest Chong
42
2
0
19 Jul 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjD
VLM
31
32
0
18 Jul 2023
Self-regulating Prompts: Foundational Model Adaptation without Forgetting
Muhammad Uzair Khattak
Syed Talal Wasim
Muzammal Naseer
Salman Khan
Ming Yang
F. Khan
VLM
23
166
0
13 Jul 2023
Understanding Prompt Tuning for V-L Models Through the Lens of Neural Collapse
Didi Zhu
Zexi Li
Min Zhang
Junkun Yuan
Yunfeng Shao
Jiashuo Liu
Kun Kuang
Yinchuan Li
Chao Wu
VLM
26
1
0
28 Jun 2023
Towards Open Vocabulary Learning: A Survey
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Guohao Li
Dacheng Tao
ObjD
VLM
34
136
0
28 Jun 2023
Hyp-OW: Exploiting Hierarchical Structure Learning with Hyperbolic Distance Enhances Open World Object Detection
T. Doan
Xin Li
Sima Behpour
Wenbin He
Liangke Gou
Liu Ren
23
7
0
25 Jun 2023
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models
Muhammad Maaz
H. Rasheed
Salman Khan
F. Khan
MLLM
29
587
0
08 Jun 2023
USD: Unknown Sensitive Detector Empowered by Decoupled Objectness and Segment Anything Model
Yulin He
Wei Chen
Yusong Tan
Siqi Wang
18
8
0
04 Jun 2023
Multi-modal Queried Object Detection in the Wild
Yifan Xu
Mengdan Zhang
Chaoyou Fu
Peixian Chen
Xiaoshan Yang
Ke Li
Changsheng Xu
ObjD
VLM
30
30
0
30 May 2023
KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models
Zhiwei Jia
P. Narayana
Arjun Reddy Akula
G. Pruthi
Haoran Su
Sugato Basu
Varun Jampani
VLM
OffRL
15
4
0
28 May 2023