Class-agnostic Object Detection with Multi-modal Transformer

22 November 2021
Muhammad Maaz
H. Rasheed
Salman Khan
F. Khan
Rao Muhammad Anwer
Ming Yang

Papers citing "Class-agnostic Object Detection with Multi-modal Transformer"

50 / 66 papers shown
Enhancing Target-unspecific Tasks through a Features Matrix
Fangming Cui
Yonggang Zhang
Xuan Wang
Xinmei Tian
Jun Yu
AAML
50
0
0
06 May 2025
ResNetVLLM -- Multi-modal Vision LLM for the Video Understanding Task
Ahmad Khalil
Mahmoud Khalil
A. Ngom
VLM
42
1
0
20 Apr 2025
A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection
Shenghao Fu
Junkai Yan
Q. Yang
Xihan Wei
Xiaohua Xie
Wei-Shi Zheng
ObjD
VLM
48
0
0
13 Mar 2025
Space Rotation with Basis Transformation for Training-free Test-Time Adaptation
Chenhao Ding
Xinyuan Gao
Songlin Dong
Yuhang He
Qiang Wang
Xiang Song
Alex C. Kot
Yihong Gong
TTA
VLM
99
0
0
27 Feb 2025
YOLO-UniOW: Efficient Universal Open-World Object Detection
Lihao Liu
Juexiao Feng
Hui Chen
Ao Wang
Lin Song
J. Han
Guiguang Ding
ObjD
VLM
49
2
0
31 Dec 2024
From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects
Zizhao Li
Zhengkang Xiang
Joseph West
Kourosh Khoshelham
ObjD
VLM
99
1
0
27 Nov 2024
3D Audio-Visual Segmentation
Artem Sokolov
Swapnil Bhosale
Xiatian Zhu
VOS
31
0
0
04 Nov 2024
Open World Object Detection: A Survey
Yiming Li
Yi Wang
Wenqian Wang
Dan Lin
Bingbing Li
Kim-Hui Yap
ObjD
39
0
0
15 Oct 2024
LOBG:Less Overfitting for Better Generalization in Vision-Language Model
Chenhao Ding
Xinyuan Gao
Songlin Dong
Yuhang He
Qiang Wang
Alex C. Kot
Yihong Gong
VLM
37
1
0
14 Oct 2024
O1O: Grouping of Known Classes to Identify Unknown Objects as Odd-One-Out
Mısra Yavuz
Fatma Guney
28
0
0
10 Oct 2024
CatFree3D: Category-agnostic 3D Object Detection with Diffusion
Wenjing Bian
Zirui Wang
Andrea Vedaldi
39
1
0
22 Aug 2024
Multimodal Foundational Models for Unsupervised 3D General Obstacle Detection
Tamás Matuszka
Peter Hajas
Dávid Szeghy
42
0
0
22 Aug 2024
Advancing Prompt Learning through an External Layer
Fangming Cui
Xun Yang
Chao Wu
Liang Xiao
Xinmei Tian
VLM
38
1
0
29 Jul 2024
ActionSwitch: Class-agnostic Detection of Simultaneous Actions in Streaming Videos
Hyolim Kang
Jeongseok Hyun
Joungbin An
Youngjae Yu
Seon Joo Kim
38
0
0
17 Jul 2024
Quantized Prompt for Efficient Generalization of Vision-Language Models
Tianxiang Hao
Xiaohan Ding
Juexiao Feng
Yuhong Yang
Hui Chen
Guiguang Ding
VLM
MQ
32
5
0
15 Jul 2024
XAMI -- A Benchmark Dataset for Artefact Detection in XMM-Newton Optical Images
Elisabeta-Iulia Dima
Pablo Gómez
Sandor Kruk
Peter Kretschmar
Simon Rosen
Călin-Adrian Popa
28
0
0
25 Jun 2024
DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection
Jia Syuen Lim
Zhuoxiao Chen
Mahsa Baktashmotlagh
Zhi Chen
Xin Yu
Zi Huang
Yadan Luo
VLM
ObjD
82
1
0
21 Jun 2024
RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection
Fangyi Chen
Han Zhang
Zhantao Yang
Hao Chen
Kai Hu
Marios Savvides
ObjD
VLM
41
5
0
30 May 2024
Multimodal Object Detection via Probabilistic a priori Information Integration
Hafsa El Hafyani
Bastien Pasdeloup
Camille Yver
Pierre Romenteau
25
0
0
24 May 2024
ChEX: Interactive Localization and Region Description in Chest X-rays
Philip Muller
Georgios Kaissis
Daniel Rueckert
35
5
0
24 Apr 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
51
7
0
28 Mar 2024
Unsupervised Audio-Visual Segmentation with Modality Alignment
Swapnil Bhosale
Haosen Yang
Diptesh Kanojia
Jiangkang Deng
Xiatian Zhu
VOS
43
5
0
21 Mar 2024
As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks?
Anjun Hu
Jindong Gu
Francesco Pinto
Konstantinos Kamnitsas
Philip H. S. Torr
AAML
SILM
37
5
0
19 Mar 2024
Unsupervised Collaborative Metric Learning with Mixed-Scale Groups for General Object Retrieval
Shichao Kan
Yuhai Deng
Yixiong Liang
Lihui Cen
Zhe Qu
Yigang Cen
Zhihai He
40
0
0
16 Mar 2024
Zero-shot Generalizable Incremental Learning for Vision-Language Object Detection
Jieren Deng
Haojian Zhang
Kun Ding
Jianhua Hu
Xingxuan Zhang
Yunkuan Wang
VLM
ObjD
82
4
0
04 Mar 2024
APLe: Token-Wise Adaptive for Multi-Modal Prompt Learning
Guiming Cao
Kaize Shi
Hong Fu
Huaiwen Zhang
Guandong Xu
VLM
31
1
0
12 Jan 2024
YOLO-Former: YOLO Shakes Hand With ViT
J. Khoramdel
A. Moori
Y. Borhani
A. Ghanbarzadeh
Esmaeil Najafi
ViT
24
2
0
11 Jan 2024
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Haobo Yuan
Xiangtai Li
Chong Zhou
Yining Li
Kai Chen
Chen Change Loy
VLM
29
51
0
05 Jan 2024
COMMA: Co-Articulated Multi-Modal Learning
Lianyu Hu
Liqing Gao
Zekang Liu
Chi-Man Pun
Wei Feng
VLM
20
0
0
30 Dec 2023
Understanding the Multi-modal Prompts of the Pre-trained Vision-Language Model
Shuailei Ma
Chen-Wei Xie
Ying-yu Wei
Siyang Sun
Jiaqi Fan
Xiaoyi Bao
Yuxin Guo
Yun Zheng
VLM
VPVLM
26
2
0
18 Dec 2023
MobileSAMv2: Faster Segment Anything to Everything
Chaoning Zhang
Dongshen Han
Sheng Zheng
J. Choi
Tae-Ho Kim
Choong Seon Hong
VLM
27
23
0
15 Dec 2023
ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open-Vocabulary Object Detection
Joonhyun Jeong
Geondo Park
Jayeon Yoo
Hyungsik Jung
Heesu Kim
VLM
ObjD
41
10
0
12 Dec 2023
Open World Object Detection in the Era of Foundation Models
O. Zohar
Alejandro Lozano
Shelly Goel
Serena Yeung
Kuan-Chieh Wang
VLM
31
9
0
10 Dec 2023
VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding
Yizhou Wang
Ruiyi Zhang
Haoliang Wang
Uttaran Bhattacharya
Yun Fu
Gang Wu
MLLM
32
10
0
04 Dec 2023
Proposal-Level Unsupervised Domain Adaptation for Open World Unbiased Detector
Xuanyi Liu
Zhongqi Yue
Xian-Sheng Hua
17
0
0
04 Nov 2023
Towards Open World Active Learning for 3D Object Detection
Zhuoxiao Chen
Yadan Luo
Zixin Wang
Zijian Wang
Xin Yu
Zi Huang
32
0
0
16 Oct 2023
Leveraging Foundation models for Unsupervised Audio-Visual Segmentation
Swapnil Bhosale
Haosen Yang
Diptesh Kanojia
Xiatian Zhu
VOS
47
5
0
13 Sep 2023
Transformers in Small Object Detection: A Benchmark and Survey of State-of-the-Art
Aref Miri Rekavandi
Shima Rashidi
F. Boussaïd
Stephen Hoefs
Emre Akbas
Bennamoun
ViT
46
23
0
10 Sep 2023
Contrastive Feature Masking Open-Vocabulary Vision Transformer
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD
VLM
23
27
0
02 Sep 2023
Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection
Yifan Xu
Mengdan Zhang
Xiaoshan Yang
Changsheng Xu
ObjD
32
5
0
30 Aug 2023
GenKL: An Iterative Framework for Resolving Label Ambiguity and Label Non-conformity in Web Images Via a New Generalized KL Divergence
Xia Huang
Kai Fong Ernest Chong
42
2
0
19 Jul 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjD
VLM
31
32
0
18 Jul 2023
Self-regulating Prompts: Foundational Model Adaptation without Forgetting
Muhammad Uzair Khattak
Syed Talal Wasim
Muzammal Naseer
Salman Khan
Ming Yang
F. Khan
VLM
23
166
0
13 Jul 2023
Understanding Prompt Tuning for V-L Models Through the Lens of Neural Collapse
Didi Zhu
Zexi Li
Min Zhang
Junkun Yuan
Yunfeng Shao
Jiashuo Liu
Kun Kuang
Yinchuan Li
Chao Wu
VLM
26
1
0
28 Jun 2023
Towards Open Vocabulary Learning: A Survey
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Guohao Li
Dacheng Tao
ObjD
VLM
34
136
0
28 Jun 2023
Hyp-OW: Exploiting Hierarchical Structure Learning with Hyperbolic Distance Enhances Open World Object Detection
T. Doan
Xin Li
Sima Behpour
Wenbin He
Liangke Gou
Liu Ren
23
7
0
25 Jun 2023
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models
Muhammad Maaz
H. Rasheed
Salman Khan
F. Khan
MLLM
29
587
0
08 Jun 2023
USD: Unknown Sensitive Detector Empowered by Decoupled Objectness and Segment Anything Model
Yulin He
Wei Chen
Yusong Tan
Siqi Wang
18
8
0
04 Jun 2023
Multi-modal Queried Object Detection in the Wild
Yifan Xu
Mengdan Zhang
Chaoyou Fu
Peixian Chen
Xiaoshan Yang
Ke Li
Changsheng Xu
ObjD
VLM
30
30
0
30 May 2023
KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models
Zhiwei Jia
P. Narayana
Arjun Reddy Akula
G. Pruthi
Haoran Su
Sugato Basu
Varun Jampani
VLM
OffRL
15
4
0
28 May 2023