2304.04704
Cited By
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition
10 April 2023
Shuhuai Ren
Aston Zhang
Yi Zhu
Shuai Zhang
Shuai Zheng
Mu Li
Alexander J. Smola
Xu Sun
VPVLM
VLM
Papers citing "Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition"
29 / 29 papers shown
Title
World to Code: Multi-modal Data Generation via Self-Instructed Compositional Captioning and Filtering
Jiacong Wang
Bohong Wu
Haiyong Jiang
Xun Zhou
Xin Xiao
Haoyuan Guo
Jun Xiao
VLM
VGen
36
4
0
30 Sep 2024
Revisiting Prompt Pretraining of Vision-Language Models
Zhenyuan Chen
Lingfeng Yang
Shuo Chen
Zhaowei Chen
Jiajun Liang
Xiang Li
MLLM
VPVLM
VLM
43
1
0
10 Sep 2024
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
Yuhan Zhu
Yuyang Ji
Zhiyu Zhao
Gangshan Wu
Limin Wang
VLM
41
7
0
05 Jul 2024
Robust Adaptation of Foundation Models with Black-Box Visual Prompting
Changdae Oh
Gyeongdeok Seo
Geunyoung Jung
Zhi-Qi Cheng
Hosik Choi
Jiyoung Jung
Kyungwoo Song
VLM
38
1
0
04 Jul 2024
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
Marco Mistretta
Alberto Baldrati
Marco Bertini
Andrew D. Bagdanov
VPVLM
VLM
35
6
0
03 Jul 2024
Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection
Jiaming Li
Jiacheng Zhang
Jichang Li
Ge Li
Si Liu
Liang Lin
Guanbin Li
ObjD
VLM
48
13
0
01 Jun 2024
DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
Linli Yao
Lei Li
Shuhuai Ren
Lean Wang
Yuanxin Liu
Xu Sun
Lu Hou
35
28
0
31 May 2024
Enhancing Fine-Grained Image Classifications via Cascaded Vision Language Models
Canshi Wei
VLM
32
0
0
18 May 2024
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection
Mingxuan Liu
Tyler L. Hayes
Elisa Ricci
G. Csurka
Riccardo Volpi
ObjD
58
1
0
16 May 2024
PromptKD: Unsupervised Prompt Distillation for Vision-Language Models
Zheng Li
Xiang Li
Xinyi Fu
Xing Zhang
Weiqiang Wang
Shuo Chen
Jian Yang
VLM
39
35
0
05 Mar 2024
Simple Image-level Classification Improves Open-vocabulary Object Detection
Ru Fang
Guansong Pang
Xiaolong Bai
ObjD
VLM
53
14
0
16 Dec 2023
Auto-Vocabulary Semantic Segmentation
Osman Ülger
Maksymilian Kulicki
Yuki M. Asano
Martin R. Oswald
VLM
45
2
0
07 Dec 2023
GLaMM: Pixel Grounding Large Multimodal Model
H. Rasheed
Muhammad Maaz
Sahal Shaji Mullappilly
Abdelrahman M. Shaker
Salman Khan
Hisham Cholakkal
Rao M. Anwer
Eric Xing
Ming-Hsuan Yang
Fahad S. Khan
MLLM
VLM
41
201
0
06 Nov 2023
Rethinking Evaluation Metrics of Open-Vocabulary Segmentaion
Hao Zhou
Tiancheng Shen
Xu Yang
Hai Huang
Xiangtai Li
Lu Qi
Ming-Hsuan Yang
86
12
0
06 Nov 2023
TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
Shuhuai Ren
Sishuo Chen
Shicheng Li
Xu Sun
Lu Hou
ViT
43
28
0
29 Oct 2023
OV-VG: A Benchmark for Open-Vocabulary Visual Grounding
Chunlei Wang
Wenquan Feng
Xiangtai Li
Guangliang Cheng
Shuchang Lyu
Binghao Liu
Lijiang Chen
Qi Zhao
ObjD
VLM
26
9
0
22 Oct 2023
Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond
Liang Chen
Yichi Zhang
Shuhuai Ren
Haozhe Zhao
Zefan Cai
Yuchi Wang
Peiyi Wang
Tianyu Liu
Baobao Chang
LM&Ro
LLMAG
33
41
0
03 Oct 2023
Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment
Shengxiang Zhang
Muzammal Naseer
Guangyi Chen
Zhiqiang Shen
Salman Khan
Anton van den Hengel
F. Khan
VLM
60
4
0
24 Aug 2023
DPL: Decoupled Prompt Learning for Vision-Language Models
C. Xu
Yuhan Zhu
Guozhen Zhang
Haocheng Shen
Yixuan Liao
Xiaoxin Chen
Gangshan Wu
Limin Wang
VLM
21
4
0
19 Aug 2023
Link-Context Learning for Multimodal LLMs
Yan Tai
Weichen Fan
Zhao Zhang
Feng Zhu
Rui Zhao
Ziwei Liu
ReLM
LRM
21
17
0
15 Aug 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjD
VLM
31
32
0
18 Jul 2023
Towards Open Vocabulary Learning: A Survey
Jianzong Wu
Xiangtai Li
Shilin Xu
Haobo Yuan
Henghui Ding
...
Jiangning Zhang
Yu Tong
Xudong Jiang
Guohao Li
Dacheng Tao
ObjD
VLM
34
136
0
28 Jun 2023
Retrieval-Enhanced Visual Prompt Learning for Few-shot Classification
Jintao Rong
Hao Chen
Tianrun Chen
Linlin Ou
Xinyi Yu
Yifan Liu
VLM
VPVLM
15
6
0
04 Jun 2023
Multi-modal Queried Object Detection in the Wild
Yifan Xu
Mengdan Zhang
Chaoyou Fu
Peixian Chen
Xiaoshan Yang
Ke Li
Changsheng Xu
ObjD
VLM
30
30
0
30 May 2023
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
F. Khan
VPVLM
VLM
203
531
0
06 Oct 2022
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
Manli Shu
Weili Nie
De-An Huang
Zhiding Yu
Tom Goldstein
Anima Anandkumar
Chaowei Xiao
VLM
VPVLM
186
282
0
15 Sep 2022
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
342
2,271
0
02 Sep 2021
ImageNet-21K Pretraining for the Masses
T. Ridnik
Emanuel Ben-Baruch
Asaf Noy
Lihi Zelnik-Manor
SSeg
VLM
CLIP
181
687
0
22 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
301
3,708
0
11 Feb 2021