F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models

30 September 2022
Weicheng Kuo
Huayu Chen
Xiuye Gu
A. Piergiovanni
A. Angelova
    MLLM
    VLM
    ObjD

Papers citing "F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models"

50 / 114 papers shown
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Yulin Chen
Zhuotao Tian
VLM
40
0
0
07 May 2025
VCM: Vision Concept Modeling Based on Implicit Contrastive Learning with Vision-Language Instruction Fine-Tuning
Run Luo
Renke Shan
Longze Chen
Zichen Liu
Lu Wang
Min Yang
Xiaobo Xia
MLLM
VLM
99
0
0
28 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD
VOS
103
2
0
17 Apr 2025
Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation
Yongchao Feng
Yajie Liu
Shuai Yang
Wenrui Cai
Jingyang Zhang
...
Jiahui Lv
Zichen Liu
Tengyuan Shi
Qingjie Liu
Yansen Wang
MLLM
VLM
63
1
0
13 Apr 2025
Post-processing for Fair Regression via Explainable SVD
Zhiqun Zuo
Ding Zhu
Mohammad Mahdi Khalili
235
0
0
04 Apr 2025
WonderTurbo: Generating Interactive 3D World in 0.72 Seconds
Chaojun Ni
Xiaofeng Wang
Zheng Zhu
Wei Wang
Haoyun Li
Guosheng Zhao
Jie Li
Wenkang Qin
Guan Huang
Wenjun Mei
3DGS
ViT
VGen
233
1
0
03 Apr 2025
Refining CLIP's Spatial Awareness: A Visual-Centric Perspective
Congpei Qiu
Yanhao Wu
Wei Ke
Xiuxiu Bai
Tong Zhang
VLM
52
0
0
03 Apr 2025
Point-Cache: Test-time Dynamic and Hierarchical Cache for Robust and Generalizable Point Cloud Analysis
Hongyu Sun
Qiuhong Ke
Ming Cheng
Yunhong Wang
Deying Li
Chenhui Gou
Jianfei Cai
3DPC
92
0
0
15 Mar 2025
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
Chuhan Zhang
Chaoyang Zhu
Pingcheng Dong
Long Chen
Dong Zhang
ObjD
VLM
227
0
0
14 Mar 2025
A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection
Shenghao Fu
Junkai Yan
Q. Yang
Xihan Wei
Xiaohua Xie
Wei-Shi Zheng
ObjD
VLM
48
0
0
13 Mar 2025
Vi-LAD: Vision-Language Attention Distillation for Socially-Aware Robot Navigation in Dynamic Environments
Mohamed Bashir Elnoor
K. Weerakoon
Gershom Seneviratne
Jing Liang
Vignesh Rajagopal
Dinesh Manocha
58
0
0
12 Mar 2025
YOLOE: Real-Time Seeing Anything
Ao Wang
Lihao Liu
Hui Chen
Zijia Lin
Jiawei Han
Guiguang Ding
VLM
ObjD
80
1
0
10 Mar 2025
Grad-ECLIP: Gradient-based Visual and Textual Explanations for CLIP
Chenyang Zhao
Kun Wang
J. H. Hsiao
Antoni B. Chan
CLIP
73
0
0
26 Feb 2025
Leveraging Content and Context Cues for Low-Light Image Enhancement
Igor Morawski
Kai He
Shusil Dangi
Winston H. Hsu
96
0
0
10 Dec 2024
Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
Wentao Bao
Keqin Li
Yuxiao Chen
Deep Patel
Martin Renqiang Min
Yu Kong
VLM
ObjD
52
2
0
17 Nov 2024
Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability
Yusuke Hosoya
Masanori Suganuma
Takayuki Okatani
ObjD
21
0
0
20 Oct 2024
Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment
Chenhang Cui
An Zhang
Yiyang Zhou
Zhaorun Chen
Gelei Deng
Huaxiu Yao
Tat-Seng Chua
73
4
0
18 Oct 2024
VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI
Sijie Cheng
Kechen Fang
Yangyang Yu
Sicheng Zhou
Yangqiu Song
Ye Tian
Tingguang Li
Lei Han
Yang Liu
56
8
0
15 Oct 2024
Open World Object Detection: A Survey
Yiming Li
Yi Wang
Wenqian Wang
Dan Lin
Bingbing Li
Kim-Hui Yap
ObjD
45
0
0
15 Oct 2024
ForgeryGPT: Multimodal Large Language Model For Explainable Image Forgery Detection and Localization
Jiawei Li
Fanrui Zhang
Jiaying Zhu
Esther Sun
Qiang Zhang
Zheng-jun Zha
MLLM
59
9
0
14 Oct 2024
OW-Rep: Open World Object Detection with Instance Representation Learning
Sunoh Lee
Minsik Jeon
Jihong Min
Junwon Seo
ObjD
224
0
0
24 Sep 2024
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting
Yongqi Wang
Xinxiao Wu
Shuo Yang
Jiebo Luo
212
1
0
19 Sep 2024
FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation
Xi Chen
Haosen Yang
Sheng Jin
Xiatian Zhu
H. Yao
VLM
29
3
0
05 Sep 2024
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection
Kuo Wang
Lechao Cheng
Weikai Chen
Pingping Zhang
Liang Lin
Fan Zhou
Guanbin Li
VLM
ObjD
36
2
0
31 Jul 2024
Text2LiDAR: Text-guided LiDAR Point Cloud Generation via Equirectangular Transformer
Yang Wu
Kaihua Zhang
Jianjun Qian
Jin Xie
Jian Yang
DiffM
47
4
0
29 Jul 2024
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
Pengfei Wang
Yuxi Wang
Shuai Li
Zhaoxiang Zhang
Zhen Lei
Lei Zhang
52
3
0
18 Jul 2024
CerberusDet: Unified Multi-Task Object Detection
Irina Tolstykh
Mikhail Chernyshov
Maksim Kuprashevich
VLM
ObjD
56
0
0
17 Jul 2024
Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP
Shuyang Lin
Tong Jia
Hao Wang
Bowen Ma
Mingyuan Li
Dongyue Chen
VLM
ObjD
43
0
0
16 Jun 2024
OVMR: Open-Vocabulary Recognition with Multi-Modal References
Zehong Ma
Shiliang Zhang
Longhui Wei
Qi Tian
VLM
44
0
0
07 Jun 2024
SpatialRGPT: Grounded Spatial Reasoning in Vision Language Model
An-Chieh Cheng
Hongxu Yin
Yang Fu
Qiushan Guo
Ruihan Yang
Jan Kautz
Xiaolong Wang
Sifei Liu
LRM
64
48
0
03 Jun 2024
Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection
Jiaming Li
Jiacheng Zhang
Jichang Li
Ge Li
Si Liu
Liang Lin
Guanbin Li
ObjD
VLM
58
13
0
01 Jun 2024
RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection
Fangyi Chen
Han Zhang
Zhantao Yang
Hao Chen
Kai Hu
Marios Savvides
ObjD
VLM
46
5
0
30 May 2024
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision
Junjie Wang
Bin Chen
Bin Kang
Yulin Li
Yichi Chen
Weizhi Xian
Huifeng Chang
VLM
ObjD
36
7
0
28 May 2024
Unsupervised Image Prior via Prompt Learning and CLIP Semantic Guidance for Low-Light Image Enhancement
Igor Morawski
Kai He
Shusil Dangi
Winston H. Hsu
VLM
59
2
0
19 May 2024
Open-Vocabulary Spatio-Temporal Action Detection
Tao Wu
Shuqiu Ge
Jie Qin
Gangshan Wu
Limin Wang
ObjD
28
5
0
17 May 2024
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection
Mingxuan Liu
Tyler L. Hayes
Elisa Ricci
G. Csurka
Riccardo Volpi
ObjD
61
1
0
16 May 2024
Open-Vocabulary Object Detection via Neighboring Region Attention Alignment
Sunyuan Qiang
Xianfei Li
Yanyan Liang
Wenlong Liao
Tao He
Pai Peng
ObjD
43
0
0
14 May 2024
Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection
Zhaoxiang Zhang
Hanqiu Deng
Jinan Bao
Xingyu Li
VLM
36
1
0
08 May 2024
On the Foundations of Earth and Climate Foundation Models
Xiao Xiang Zhu
Zhitong Xiong
Yi Wang
Adam J. Stewart
Konrad Heidler
Yuanyuan Wang
Zhenghang Yuan
Thomas Dujardin
Qingsong Xu
Yilei Shi
AI4Cl
AI4CE
39
21
0
07 May 2024
Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation
Qiyuan Dai
Sibei Yang
34
8
0
18 Apr 2024
The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models
Cheng Shi
Sibei Yang
VLM
42
3
0
18 Apr 2024
Vocabulary-free Image Classification and Semantic Segmentation
Alessandro Conti
Enrico Fini
Massimiliano Mancini
Paolo Rota
Yiming Wang
Elisa Ricci
VLM
43
2
0
16 Apr 2024
COCONut: Modernizing COCO Segmentation
XueQing Deng
Qihang Yu
Peng Wang
Xiaohui Shen
Liang-Chieh Chen
48
16
0
12 Apr 2024
Segment Any 3D Object with Language
Seungjun Lee
Yuyang Zhao
Gim Hee Lee
49
1
0
02 Apr 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
Jieneng Chen
Qihang Yu
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
3DV
VLM
47
25
0
02 Apr 2024
Open-Vocabulary Object Detectors: Robustness Challenges under Distribution Shifts
Prakash Chandra Chhipa
Kanjar De
Meenakshi Subhash Chippa
Rajkumar Saini
Marcus Liwicki
ObjD
VLM
41
1
0
01 Apr 2024
Open-Set Recognition in the Age of Vision-Language Models
Dimity Miller
Niko Sünderhauf
Alex Kenna
Keita Mason
VLM
42
3
0
25 Mar 2024
FontCLIP: A Semantic Typography Visual-Language Model for Multilingual Font Applications
Yuki Tatsukawa
I-Chao Shen
Anran Qi
Yuki Koyama
Takeo Igarashi
Ariel Shamir
CLIP
VLM
33
5
0
11 Mar 2024
Multi-modal Attribute Prompting for Vision-Language Models
Xin Liu
Jiamin Wu
Wenfei Yang
Xu Zhou
Tianzhu Zhang
VLM
29
10
0
01 Mar 2024
Prospector Heads: Generalized Feature Attribution for Large Models & Data
Gautam Machiraju
Alexander Derry
Arjun D Desai
Neel Guha
Amir-Hossein Karimi
James Zou
Russ Altman
Christopher Ré
Parag Mallick
AI4TS
MedIm
50
0
0
18 Feb 2024