ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2201.02605
  4. Cited By
Detecting Twenty-thousand Classes using Image-level Supervision

Detecting Twenty-thousand Classes using Image-level Supervision

7 January 2022
Xingyi Zhou
Rohit Girdhar
Armand Joulin
Phillip Krahenbuhl
Ishan Misra
    CLIP
    VLM
ArXivPDFHTML

Papers citing "Detecting Twenty-thousand Classes using Image-level Supervision"

50 / 128 papers shown
Title
Pseudo-Label Quality Decoupling and Correction for Semi-Supervised Instance Segmentation
Pseudo-Label Quality Decoupling and Correction for Semi-Supervised Instance Segmentation
Jianghang Lin
Yilin Lu
Yunhang Shen
Chaoyang Zhu
Shengchuan Zhang
Liujuan Cao
Rongrong Ji
ISeg
26
0
0
16 May 2025
FG-CLIP: Fine-Grained Visual and Textual Alignment
FG-CLIP: Fine-Grained Visual and Textual Alignment
Chunyu Xie
Bin Wang
Fanjing Kong
Jincheng Li
Dawei Liang
Gengshen Zhang
Dawei Leng
Yuhui Yin
CLIP
VLM
53
0
0
08 May 2025
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Yulin Chen
Zhuotao Tian
VLM
38
0
0
07 May 2025
CDFormer: Cross-Domain Few-Shot Object Detection Transformer Against Feature Confusion
CDFormer: Cross-Domain Few-Shot Object Detection Transformer Against Feature Confusion
Boyuan Meng
Xinming Zhang
Peilin Li
Zhe Wu
Yiming Li
Wenkai Zhao
B. Yu
Hui-Liang Shen
ViT
93
0
0
02 May 2025
Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions
Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions
Yifei Dong
Fengyi Wu
Sanjian Zhang
Guangyu Chen
Yuzhi Hu
...
Jingdong Sun
Siyu Huang
Feng Liu
Qi Dai
Zhi-Qi Cheng
44
0
0
16 Apr 2025
Measuring Déjà vu Memorization Efficiently
Measuring Déjà vu Memorization Efficiently
Narine Kokhlikyan
Bargav Jayaraman
Florian Bordes
Chuan Guo
Kamalika Chaudhuri
30
1
0
08 Apr 2025
Post-processing for Fair Regression via Explainable SVD
Post-processing for Fair Regression via Explainable SVD
Zhiqun Zuo
Ding Zhu
Mohammad Mahdi Khalili
157
0
0
04 Apr 2025
SyncSDE: A Probabilistic Framework for Diffusion Synchronization
SyncSDE: A Probabilistic Framework for Diffusion Synchronization
Hyunjun Lee
Hyunsoo Lee
Sookwan Han
DiffM
48
0
0
27 Mar 2025
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
Zhiqiang Zhang
J. Li
Zunnan Xu
Hanhui Li
Yiji Cheng
Fa-Ting Hong
Qin Lin
Qinglin Lu
Xiaodan Liang
DiffM
73
1
0
25 Mar 2025
CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection
CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection
Zhichao Sun
Huazhang Hu
Yidong Ma
Gang Liu
Nemo Chen
Xu Tang
Yao Hu
Yongchao Xu
ObjD
47
0
0
24 Mar 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li
Cristiano Saltori
Fabio Poiesi
N. Sebe
168
0
0
20 Mar 2025
Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark
Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark
Ying Liu
Yijing Hua
Haojiang Chai
Yanbo Wang
TengQi Ye
ObjD
62
0
0
19 Mar 2025
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
Chuhan Zhang
Chaoyang Zhu
Pingcheng Dong
Long Chen
Dong Zhang
ObjD
VLM
164
0
0
14 Mar 2025
InPK: Infusing Prior Knowledge into Prompt for Vision-Language Models
InPK: Infusing Prior Knowledge into Prompt for Vision-Language Models
Shuchang Zhou
Jiwei Wei
Shiyuan He
Yuyang Zhou
Chaoning Zhang
Jie Zou
Ning Xie
Yang Yang
VLM
VPVLM
81
0
0
27 Feb 2025
QueryAdapter: Rapid Adaptation of Vision-Language Models in Response to Natural Language Queries
QueryAdapter: Rapid Adaptation of Vision-Language Models in Response to Natural Language Queries
N. H. Chapman
Feras Dayoub
Will N. Browne
Christopher F. Lehnert
VLM
77
0
0
26 Feb 2025
Occlusion-aware Text-Image-Point Cloud Pretraining for Open-World 3D Object Recognition
Occlusion-aware Text-Image-Point Cloud Pretraining for Open-World 3D Object Recognition
Khanh Nguyen
Ghulam Mubashar Hassan
Ajmal Saeed Mian
3DPC
54
0
0
15 Feb 2025
Adaptive Grasping of Moving Objects in Dense Clutter via Global-to-Local Detection and Static-to-Dynamic Planning
Adaptive Grasping of Moving Objects in Dense Clutter via Global-to-Local Detection and Static-to-Dynamic Planning
Hao Chen
Takuya Kiyokawa
Weiwei Wan
Kensuke Harada
56
0
0
09 Feb 2025
3D Reconstruction of Shoes for Augmented Reality
3D Reconstruction of Shoes for Augmented Reality
Pratik Shrestha
Sujan Kapali
Swikar Gautam
Vishal Pokharel
Santosh Giri
59
0
0
29 Jan 2025
Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement
Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement
Kei Katsumata
Motonari Kambara
Daichi Yashima
Ryosuke Korekata
Komei Sugiura
65
0
0
28 Jan 2025
Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection
Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection
Xiangyu Gao
Yu Dai
Benliu Qiu
Hongliang Li
Heqian Qiu
Hongliang Li
ObjD
VLM
151
0
0
28 Jan 2025
Enhancing Novel Object Detection via Cooperative Foundational Models
Enhancing Novel Object Detection via Cooperative Foundational Models
Rohit K Bharadwaj
Muzammal Naseer
Salman Khan
F. Khan
ObjD
VLM
160
1
0
17 Jan 2025
Are Open-Vocabulary Models Ready for Detection of MEP Elements on Construction Sites
Are Open-Vocabulary Models Ready for Detection of MEP Elements on Construction Sites
Abdalwhab Abdalwhab
A. Imran
Sina Heydarian
I. Iordanova
David St-Onge
49
0
0
16 Jan 2025
Interacted Object Grounding in Spatio-Temporal Human-Object Interactions
Interacted Object Grounding in Spatio-Temporal Human-Object Interactions
Xiaoyang Liu
Boran Wen
Xinpeng Liu
Zizheng Zhou
Hongwei Fan
Cewu Lu
Lizhuang Ma
Yulong Chen
Yongqian Li
56
2
0
27 Dec 2024
PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation
PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation
Ziyao Zeng
Jingcheng Ni
Daniel Wang
Patrick Rim
Younjoon Chung
Fengyu Yang
Byung-Woo Hong
A. Wong
DiffM
MDE
108
2
0
24 Nov 2024
Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment
Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment
Chenhang Cui
An Zhang
Yiyang Zhou
Zhaorun Chen
Gelei Deng
Huaxiu Yao
Tat-Seng Chua
70
4
0
18 Oct 2024
Open World Object Detection: A Survey
Open World Object Detection: A Survey
Yiming Li
Yi Wang
Wenqian Wang
Dan Lin
Bingbing Li
Kim-Hui Yap
ObjD
39
0
0
15 Oct 2024
ImagineNav: Prompting Vision-Language Models as Embodied Navigator
  through Scene Imagination
ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination
Xinxin Zhao
Wenzhe Cai
Likun Tang
Teng Wang
LM&Ro
37
3
0
13 Oct 2024
RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through
  Language Descriptions
RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions
Ziyao Zeng
Yangchao Wu
Hyoungseob Park
Daniel Wang
Fengyu Yang
Stefano Soatto
Dong Lao
Byung-Woo Hong
Alex Wong
MDE
22
7
0
03 Oct 2024
FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary
  Segmentation
FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation
Xi Chen
Haosen Yang
Sheng Jin
Xiatian Zhu
H. Yao
VLM
29
3
0
05 Sep 2024
Optimizing CLIP Models for Image Retrieval with Maintained
  Joint-Embedding Alignment
Optimizing CLIP Models for Image Retrieval with Maintained Joint-Embedding Alignment
Konstantin Schall
Kai Uwe Barthel
Nico Hezel
Klaus Jung
VLM
36
3
0
03 Sep 2024
Open-Ended 3D Point Cloud Instance Segmentation
Open-Ended 3D Point Cloud Instance Segmentation
Phuc D. A. Nguyen
Minh Luu
Anh Tran
Cuong Pham
Khoi Nguyen
3DPC
53
1
0
21 Aug 2024
Vocabulary-Free 3D Instance Segmentation with Vision and Language Assistant
Vocabulary-Free 3D Instance Segmentation with Vision and Language Assistant
Guofeng Mei
Luigi Riz
Yiming Wang
Fabio Poiesi
ISeg
VLM
61
3
0
20 Aug 2024
ActionSwitch: Class-agnostic Detection of Simultaneous Actions in
  Streaming Videos
ActionSwitch: Class-agnostic Detection of Simultaneous Actions in Streaming Videos
Hyolim Kang
Jeongseok Hyun
Joungbin An
Youngjae Yu
Seon Joo Kim
38
0
0
17 Jul 2024
Towards Open-World Mobile Manipulation in Homes: Lessons from the
  Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge
Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge
Sriram Yenamandra
Arun Ramachandran
Mukul Khanna
Karmesh Yadav
Jay Vakil
...
Z. Kira
Dhruv Batra
Roozbeh Mottaghi
Yonatan Bisk
Chris Paxton
LM&Ro
59
6
0
09 Jul 2024
Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP
Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP
Shuyang Lin
Tong Jia
Hao Wang
Bowen Ma
Mingyuan Li
Dongyue Chen
VLM
ObjD
41
0
0
16 Jun 2024
ROADWork Dataset: Learning to Recognize, Observe, Analyze and Drive
  Through Work Zones
ROADWork Dataset: Learning to Recognize, Observe, Analyze and Drive Through Work Zones
Anurag Ghosh
R. Tamburo
Shen Zheng
Juan R. Alvarez-Padilla
Hailiang Zhu
Michael Cardei
Nicholas Dunn
Christoph Mertz
Srinivasa G. Narasimhan
44
1
0
11 Jun 2024
Matching Anything by Segmenting Anything
Matching Anything by Segmenting Anything
Siyuan Li
Lei Ke
Martin Danelljan
Luigi Piccinelli
Mattia Segu
Luc Van Gool
Fisher Yu
VOS
37
22
0
06 Jun 2024
SIGMA: An Open-Source Interactive System for Mixed-Reality Task
  Assistance Research
SIGMA: An Open-Source Interactive System for Mixed-Reality Task Assistance Research
D. Bohus
Sean Andrist
Nick Saw
Ann Paradiso
Ishani Chakraborty
Mahdi Rad
38
9
0
16 May 2024
Exposing Text-Image Inconsistency Using Diffusion Models
Exposing Text-Image Inconsistency Using Diffusion Models
Mingzhen Huang
Shan Jia
Zhou Zhou
Yan Ju
Jialing Cai
Siwei Lyu
46
7
0
28 Apr 2024
Segment Any 3D Object with Language
Segment Any 3D Object with Language
Seungjun Lee
Yuyang Zhao
Gim Hee Lee
44
1
0
02 Apr 2024
Cross-domain Multi-modal Few-shot Object Detection via Rich Text
Cross-domain Multi-modal Few-shot Object Detection via Rich Text
Zeyu Shangguan
Daniel Seita
Mohammad Rostami
ObjD
52
1
0
24 Mar 2024
Attention-based Class-Conditioned Alignment for Multi-Source Domain
  Adaptation of Object Detectors
Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors
Atif Belal
Akhil Meethal
Francisco Perdigon Romero
M. Pedersoli
Eric Granger
43
1
0
14 Mar 2024
Diffusion Model-Based Image Editing: A Survey
Diffusion Model-Based Image Editing: A Survey
Yi Huang
Jiancheng Huang
Yifan Liu
Mingfu Yan
Jiaxi Lv
Jianzhuang Liu
Wei Xiong
He Zhang
Liangliang Cao
Liangliang Cao
EGVM
66
85
0
27 Feb 2024
Practice Makes Perfect: Planning to Learn Skill Parameter Policies
Practice Makes Perfect: Planning to Learn Skill Parameter Policies
Nishanth Kumar
Tom Silver
Willie McClinton
Linfeng Zhao
Stephen Proulx
Tomás Lozano-Pérez
L. Kaelbling
Jennifer Barry
55
18
0
22 Feb 2024
SInViG: A Self-Evolving Interactive Visual Agent for Human-Robot
  Interaction
SInViG: A Self-Evolving Interactive Visual Agent for Human-Robot Interaction
Jie Xu
Hanbo Zhang
Xinghang Li
Huaping Liu
Xuguang Lan
Tao Kong
LM&Ro
35
3
0
19 Feb 2024
Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm
  Surveillance
Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm Surveillance
Raza Imam
Muhammad Huzaifa
Nabil Mansour
Shaher Bano Mirza
Fouad Lamghari
20
0
0
10 Feb 2024
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset
Chengjian Feng
Yujie Zhong
Zequn Jie
Weidi Xie
Lin Ma
ObjD
35
13
0
08 Feb 2024
Spatial-Aware Latent Initialization for Controllable Image Generation
Spatial-Aware Latent Initialization for Controllable Image Generation
Wenqiang Sun
Tengtao Li
Zehong Lin
Jun Zhang
39
10
0
29 Jan 2024
Adaptive Mobile Manipulation for Articulated Objects In the Open World
Adaptive Mobile Manipulation for Articulated Objects In the Open World
Haoyu Xiong
Russell Mendonca
Kenneth Shaw
Deepak Pathak
34
38
0
25 Jan 2024
Domain Adaptation for Large-Vocabulary Object Detectors
Domain Adaptation for Large-Vocabulary Object Detectors
Kai Jiang
Jiaxing Huang
Weiying Xie
Jie Lei
Yunsong Li
Ling Shao
Shijian Lu
ObjD
VLM
40
2
0
13 Jan 2024
123
Next