ResearchTrend.AI
arXiv:2203.14940

Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model

28 March 2022
Yu Du
Fangyun Wei
Zihe Zhang
Miaojing Shi
Yue Gao
Guoqi Li
    VPVLM
    VLM

Papers citing "Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model"

50 / 244 papers shown
Open-vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models
Dohwan Ko
Ji Soo Lee
M. Choi
Jaewon Chu
Jihwan Park
Hyunwoo J. Kim
22
5
0
18 Aug 2023
Taming Self-Training for Open-Vocabulary Object Detection
Shiyu Zhao
S. Schulter
Long Zhao
Zhixing Zhang
Vijay Kumar B.G
Yumin Suh
Manmohan Chandraker
Dimitris N. Metaxas
VLM
ObjD
37
12
0
11 Aug 2023
Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models
Kecheng Zheng
Wei Wu
Ruili Feng
Kai Zhu
Jiawei Liu
Deli Zhao
Zhengjun Zha
Wei Chen
Yujun Shen
VLM
29
8
0
27 Jul 2023
Why Is Prompt Tuning for Vision-Language Models Robust to Noisy Labels?
Cheng-En Wu
Yu Tian
Haichao Yu
Heng Wang
Pedro Morgado
Yu Hen Hu
Linjie Yang
NoLa
VPVLM
VLM
37
18
0
22 Jul 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjD
VLM
31
32
0
18 Jul 2023
Unified Open-Vocabulary Dense Visual Prediction
Hengcan Shi
Munawar Hayat
Jianfei Cai
ObjD
VLM
43
19
0
17 Jul 2023
Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments
R. Liu
Jiaming Zhang
Kunyu Peng
Junwei Zheng
Ke Cao
Yufan Chen
Kailun Yang
Rainer Stiefelhagen
27
15
0
15 Jul 2023
Open-Vocabulary Object Detection via Scene Graph Discovery
Hengcan Shi
Munawar Hayat
Jianfei Cai
ObjD
16
12
0
07 Jul 2023
Prompting classes: Exploring the Power of Prompt Class Learning in Weakly Supervised Semantic Segmentation
Balamurali Murugesan
Rukhshanda Hussain
Rajarshi Bhattacharya
Ismail Ben Ayed
Jose Dolz
VLM
VPVLM
26
4
0
30 Jun 2023
Towards Open Vocabulary Learning: A Survey
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Guohao Li
Dacheng Tao
ObjD
VLM
34
136
0
28 Jun 2023
Text Promptable Surgical Instrument Segmentation with Vision-Language Models
Zijian Zhou
Oluwatosin O. Alabi
Meng Wei
Tom Kamiel Magda Vercauteren
Miaojing Shi
MedIm
30
23
0
15 Jun 2023
World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models
Ziqiao Ma
Jiayi Pan
J. Chai
ObjD
VLM
21
8
0
14 Jun 2023
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models
Lingxi Xie
Longhui Wei
Xiaopeng Zhang
Kaifeng Bi
Xiaotao Gu
Jianlong Chang
Qi Tian
33
7
0
14 Jun 2023
Extending CLIP's Image-Text Alignment to Referring Image Segmentation
Seoyeon Kim
Minguk Kang
Dongwon Kim
Jaesik Park
Suha Kwak
VLM
27
10
0
14 Jun 2023
Learning Domain-Aware Detection Head with Prompt Tuning
Haochen Li
Rui Zhang
Hantao Yao
Xinkai Song
Yifan Hao
Yongwei Zhao
Ling Li
Yunji Chen
VLM
27
14
0
09 Jun 2023
UniBoost: Unsupervised Unimodal Pre-training for Boosting Zero-shot Vision-Language Tasks
Yanan Sun
Zi-Qi Zhong
Qi Fan
Chi-Keung Tang
Yu-Wing Tai
VLM
33
4
0
07 Jun 2023
Fine-Grained Visual Prompting
Lingfeng Yang
Yueze Wang
Xiang Li
Xinlong Wang
Jian Yang
ObjD
VLM
32
60
0
07 Jun 2023
LoCoOp: Few-Shot Out-of-Distribution Detection via Prompt Learning
Atsuyuki Miyai
Qing Yu
Go Irie
Kiyoharu Aizawa
OODD
32
64
0
02 Jun 2023
LOWA: Localize Objects in the Wild with Attributes
Xiaoyuan Guo
Kezhen Chen
Jinmeng Rao
Yawen Zhang
Baochen Sun
Jie Yang
ObjD
43
2
0
31 May 2023
Multi-modal Queried Object Detection in the Wild
Yifan Xu
Mengdan Zhang
Chaoyou Fu
Peixian Chen
Xiaoshan Yang
Ke Li
Changsheng Xu
ObjD
VLM
30
30
0
30 May 2023
Contextual Object Detection with Multimodal Large Language Models
Yuhang Zang
Wei Li
Jun Han
Kaiyang Zhou
Chen Change Loy
ObjD
VLM
MLLM
32
78
0
29 May 2023
Discovering Novel Actions from Open World Egocentric Videos with Object-Grounded Visual Commonsense Reasoning
Sanjoy Kundu
Shubham Trehan
Sathyanarayanan N. Aakur
LRM
LM&Ro
27
1
0
26 May 2023
Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment
Runqi Wang
Hao Zheng
Xiaoyue Duan
Jianzhuang Liu
Yuning Lu
Tian Wang
Songcen Xu
Baochang Zhang
VLM
26
12
0
19 May 2023
Going Denser with Open-Vocabulary Part Segmentation
Pei Sun
Shoufa Chen
Chenchen Zhu
Fanyi Xiao
Ping Luo
Saining Xie
Zhicheng Yan
ObjD
VLM
27
45
0
18 May 2023
Mobile User Interface Element Detection Via Adaptively Prompt Tuning
Zhangxuan Gu
Zhuoer Xu
Haoxing Chen
Jun Lan
Changhua Meng
Weiqiang Wang
19
4
0
16 May 2023
Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers
Dahun Kim
A. Angelova
Weicheng Kuo
ObjD
ViT
VLM
30
73
0
11 May 2023
Vision-Language Models in Remote Sensing: Current Progress and Future Trends
Xiang Li
Congcong Wen
Yuan Hu
Zhenghang Yuan
Xiao Xiang Zhu
VLM
21
71
0
09 May 2023
Hypernymization of named entity-rich captions for grounding-based multi-modal pretraining
Giacomo Nebbia
Adriana Kovashka
17
0
0
25 Apr 2023
OVTrack: Open-Vocabulary Multiple Object Tracking
Siyuan Li
Tobias Fischer
Lei Ke
Henghui Ding
Martin Danelljan
Fisher Yu
DiffM
30
44
0
17 Apr 2023
Progressive Visual Prompt Learning with Contrastive Feature Re-formation
C. Xu
Yuhan Zhu
Haocheng Shen
Fengyuan Shi
Boheng Chen
Yixuan Liao
Xiaoxin Chen
Limin Wang
VLM
36
20
0
17 Apr 2023
TagCLIP: Improving Discrimination Ability of Open-Vocabulary Semantic Segmentation
Jingyao Li
Pengguang Chen
Shengju Qian
Jiaya Jia
VLM
32
13
0
15 Apr 2023
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition
Shuhuai Ren
Aston Zhang
Yi Zhu
Shuai Zhang
Shuai Zheng
Mu Li
Alexander J. Smola
Xu Sun
VPVLM
VLM
24
28
0
10 Apr 2023
Defense-Prefix for Preventing Typographic Attacks on CLIP
Hiroki Azuma
Yusuke Matsui
VLM
AAML
20
17
0
10 Apr 2023
V3Det: Vast Vocabulary Visual Detection Dataset
Jiaqi Wang
Pan Zhang
Tao Chu
Yuhang Cao
Yujie Zhou
Tong Wu
Bin Wang
Conghui He
Dahua Lin
VLM
ObjD
29
52
0
07 Apr 2023
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Syed Talal Wasim
Muzammal Naseer
Salman Khan
F. Khan
M. Shah
VLM
VPVLM
33
74
0
06 Apr 2023
Zero-shot Generative Model Adaptation via Image-specific Prompt Learning
Jiayi Guo
Chaofei Wang
You Wu
Eric Zhang
Kai Wang
Xingqian Xu
S. Song
Humphrey Shi
Gao Huang
DiffM
VLM
79
29
0
06 Apr 2023
Learning to Name Classes for Vision and Language Models
Sarah Parisot
Yongxin Yang
Steven G. McDonagh
VLM
17
10
0
04 Apr 2023
Towards Open-Vocabulary Video Instance Segmentation
Haochen Wang
Cilin Yan
Shuailong Wang
Xiaolong Jiang
Xu Tang
Yao Hu
Weidi Xie
E. Gavves
VOS
VLM
23
29
0
04 Apr 2023
RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding
Jihan Yang
Runyu Ding
Weipeng Deng
Zhe Wang
Xiaojuan Qi
20
62
0
03 Apr 2023
Vision-Language Models for Vision Tasks: A Survey
Jingyi Zhang
Jiaxing Huang
Sheng Jin
Shijian Lu
VLM
41
483
0
03 Apr 2023
Zero-shot Referring Image Segmentation with Global-Local Context Features
S. Yu
Paul Hongsuck Seo
Jeany Son
6
49
0
31 Mar 2023
Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations
VS Vibashan
Ning Yu
Chen Xing
Can Qin
M. Gao
Juan Carlos Niebles
Vishal M. Patel
Ran Xu
VLM
ISeg
30
18
0
29 Mar 2023
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks
Weicheng Kuo
A. Piergiovanni
Dahun Kim
Xiyang Luo
Benjamin Caine
...
Luowei Zhou
Andrew M. Dai
Zhifeng Chen
Claire Cui
A. Angelova
MLLM
VLM
29
23
0
29 Mar 2023
HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
Sha Ning
Longtian Qiu
Yongfei Liu
Xuming He
VLM
33
42
0
28 Mar 2023
POAR: Towards Open Vocabulary Pedestrian Attribute Recognition
Yue Zhang
Suchen Wang
Shichao Kan
Zhenyu Weng
Yigang Cen
Yap-Peng Tan
ViT
37
3
0
26 Mar 2023
Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection
Hwanjun Song
Jihwan Bang
VLM
ObjD
29
14
0
25 Mar 2023
Three ways to improve feature alignment for open vocabulary detection
Relja Arandjelović
A. Andonian
A. Mensch
Olivier J. Hénaff
Jean-Baptiste Alayrac
Andrew Zisserman
VLM
ObjD
33
19
0
23 Mar 2023
CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning
Yiting Cheng
Fangyun Wei
Jianmin Bao
Dong Chen
Wenqian Zhang
SLR
24
28
0
22 Mar 2023
Natural Language-Assisted Sign Language Recognition
Ronglai Zuo
Fangyun Wei
Brian Mak
SLR
23
37
0
21 Mar 2023
Detecting Everything in the Open World: Towards Universal Object Detection
Zhenyu Wang
Yali Li
Xi Chen
Ser-Nam Lim
Antonio Torralba
Hengshuang Zhao
Shengjin Wang
ObjD
VLM
32
77
0
21 Mar 2023