2203.03605
Cited By
v1
v2
v3
v4 (latest)
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
7 March 2022
Hao Zhang
Feng Li
Shilong Liu
Lei Zhang
Hang Su
Jun Zhu
L. Ni
H. Shum
ViT
Github (2506★)
Papers citing "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
50 / 742 papers shown
Title
ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open-Vocabulary Object Detection
Joonhyun Jeong
Geondo Park
Jayeon Yoo
Hyungsik Jung
Heesu Kim
VLM
ObjD
92
11
0
12 Dec 2023
OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection
Hu Zhang
Jianhua Xu
Tao Tang
Haiyang Sun
Xin Yu
Zi Huang
Kaicheng Yu
ObjD
3DPC
81
12
0
12 Dec 2023
Mixed Pseudo Labels for Semi-Supervised Object Detection
Ze-Yi Chen
Wenwei Zhang
Xinjiang Wang
Kai Chen
Zhi Wang
ObjD
81
10
0
12 Dec 2023
A Multimodal Dataset and Benchmark for Radio Galaxy and Infrared Host Detection
N. Gupta
Zeeshan Hayder
Ray P. Norris
Minh Huynh
Lars Petersson
18
3
0
11 Dec 2023
MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation
Abdullah Rashwan
Jiageng Zhang
A. Taalimi
Fan Yang
Xingyi Zhou
Chaochao Yan
Liang-Chieh Chen
Yeqing Li
ViT
114
5
0
11 Dec 2023
You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception
Sheng Jin
Shuhuai Li
Tong Li
Wentao Liu
Chao Qian
Ping Luo
115
5
0
09 Dec 2023
Vision-based Learning for Drones: A Survey
Jiaping Xiao
Rangya Zhang
Yuhang Zhang
Mir Feroskhan
69
5
0
08 Dec 2023
Lyrics: Boosting Fine-grained Language-Vision Alignment and Comprehension via Semantic-aware Visual Objects
Junyu Lu
Ruyi Gan
Di Zhang
Xiaojun Wu
Ziwei Wu
Renliang Sun
Jiaxing Zhang
Pingjian Zhang
Yan Song
MLLM
VLM
96
17
0
08 Dec 2023
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
Hao Zhang
Hongyang Li
Feng Li
Tianhe Ren
Xueyan Zou
...
Shijia Huang
Jianfeng Gao
Lei Zhang
Chun-yue Li
Jianwei Yang
189
76
0
05 Dec 2023
Lenna: Language Enhanced Reasoning Detection Assistant
Fei Wei
Xinyu Zhang
Ailing Zhang
Bo Zhang
Xiangxiang Chu
MLLM
LRM
99
25
0
05 Dec 2023
MobileUtr: Revisiting the relationship between light-weight CNN and Transformer for efficient medical image segmentation
Fenghe Tang
Bingkun Nian
Jianrui Ding
Quan Quan
Jie Yang
Wei Liu
S.Kevin Zhou
ViT
MedIm
105
3
0
04 Dec 2023
Learning Efficient Unsupervised Satellite Image-based Building Damage Detection
Yiyun Zhang
Zijian Wang
Yadan Luo
Xin Yu
Zi Huang
48
4
0
04 Dec 2023
DiverseDream: Diverse Text-to-3D Synthesis with Augmented Text Embedding
Uy Dieu Tran
Minh Luu
P. Nguyen
K. Nguyen
Binh-Son Hua
98
1
0
02 Dec 2023
Segment and Caption Anything
Xiaoke Huang
Jianfeng Wang
Yansong Tang
Zheng Zhang
Han Hu
Jiwen Lu
Lijuan Wang
Zicheng Liu
MLLM
VLM
92
21
0
01 Dec 2023
BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos
Pilhyeon Lee
Hyeran Byun
79
11
0
30 Nov 2023
Language-conditioned Detection Transformer
Jang Hyun Cho
Philipp Krahenbuhl
VLM
ObjD
95
1
0
29 Nov 2023
A Graph-Based Approach for Category-Agnostic Pose Estimation
Or Hirschorn
S. Avidan
128
13
0
29 Nov 2023
PViT-6D: Overclocking Vision Transformers for 6D Pose Estimation with Confidence-Level Prediction and Pose Tokens
Sebastian Stapf
Tobias Bauernfeind
Marco Riboldi
ViT
58
1
0
29 Nov 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
ViT
93
98
0
28 Nov 2023
Stable Segment Anything Model
Qi Fan
Xin Tao
Lei Ke
Mingqiao Ye
Yuanhui Zhang
Pengfei Wan
Zhong-ming Wang
Yu-Wing Tai
Chi-Keung Tang
VLM
85
6
0
27 Nov 2023
Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models
Yufei Zhan
Yousong Zhu
Zhiyang Chen
Fan Yang
E. Goles
Jinqiao Wang
ObjD
114
17
0
24 Nov 2023
OneFormer3D: One Transformer for Unified Point Cloud Segmentation
Maksim Kolodiazhnyi
Anna Vorontsova
Anton Konushin
D. Rukhovich
ViT
96
52
0
24 Nov 2023
The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024
Benjamin Kiefer
Lojze Žust
Matej Kristan
J. Pers
Matija Tersek
...
Magdalena Šumunec
Nadir Kapetanović
A. Michel
Wolfgang Gross
Martin Weinmann
62
4
0
23 Nov 2023
T-Rex: Counting by Visual Prompting
Qing Jiang
Feng Li
Tianhe Ren
Shilong Liu
Zhaoyang Zeng
Kent Yu
Lei Zhang
100
14
0
22 Nov 2023
Evaluating Supervision Levels Trade-Offs for Infrared-Based People Counting
David Latortue
Moetez Kdayem
F. Guerrero-Peña
Eric Granger
M. Pedersoli
76
0
0
20 Nov 2023
SniffyArt: The Dataset of Smelling Persons
Mathias Zinnen
Azhar Hussian
Hang Tran
Prathmesh Madhu
Andreas Maier
Vincent Christlein
70
9
0
20 Nov 2023
Sparse4D v3: Advancing End-to-End 3D Detection and Tracking
Xuewu Lin
Zi-Hui Pei
Tianwei Lin
Lichao Huang
Zhizhong Su
97
38
0
20 Nov 2023
FreeKD: Knowledge Distillation via Semantic Frequency Prompt
Yuan Zhang
Tao Huang
Jiaming Liu
Tao Jiang
Kuan Cheng
Shanghang Zhang
AAML
89
16
0
20 Nov 2023
Decoupled DETR For Few-shot Object Detection
Zeyu Shangguan
Lian Huai
Tong Liu
Xingqun Jiang
105
2
0
20 Nov 2023
Contrastive Learning for Multi-Object Tracking with Transformers
Pierre-François De Plaen
Nicola Marinello
Marc Proesmans
Tinne Tuytelaars
Luc Van Gool
VOT
106
7
0
14 Nov 2023
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
127
175
0
10 Nov 2023
DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets
Yash Jain
Harkirat Singh Behl
Z. Kira
Vibhav Vineet
69
14
0
08 Nov 2023
Autonomous Advanced Aerial Mobility -- An End-to-end Autonomy Framework for UAVs and Beyond
Sakshi Mishra
Praveen Palanisamy
92
16
0
08 Nov 2023
Cal-DETR: Calibrated Detection Transformer
Muhammad Akhtar Munir
Salman Khan
Muhammad Haris Khan
Mohsen Ali
Fahad Shahbaz Khan
90
9
0
06 Nov 2023
AiluRus: A Scalable ViT Framework for Dense Prediction
Jin Li
Yaoming Wang
Xiaopeng Zhang
Bowen Shi
Dongsheng Jiang
Chenglin Li
Wenrui Dai
Hongkai Xiong
Qi Tian
130
5
0
02 Nov 2023
A High-Resolution Dataset for Instance Detection with Multi-View Instance Capture
Qianqian Shen
Yunhan Zhao
Nahyun Kwon
Jeeeun Kim
Yanan Li
Shu Kong
54
2
0
30 Oct 2023
Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models
Tsun-Hsuan Wang
Alaa Maalouf
Wei Xiao
Yutong Ban
Alexander Amini
Guy Rosman
S. Karaman
Daniela Rus
73
46
0
26 Oct 2023
LP-OVOD: Open-Vocabulary Object Detection by Linear Probing
Chau Pham
Truong Vu
Khoi Duc Minh Nguyen
ObjD
95
17
0
26 Oct 2023
Prompt-Driven Building Footprint Extraction in Aerial Images with Offset-Building Model
Kai Li
Yupeng Deng
Yun-long Kong
Diyou Liu
Jingbo Chen
Yu Meng
Junxian Ma
Chenhao Wang
256
1
0
25 Oct 2023
Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection
Manyuan Zhang
Guanglu Song
Yu Liu
Hongsheng Li
95
14
0
24 Oct 2023
Detrive: Imitation Learning with Transformer Detection for End-to-End Autonomous Driving
Dao-zheng Chen
Ning Wang
Feng Chen
A. Pipe
ViT
64
4
0
22 Oct 2023
Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection
Junjia Huang
Haofeng Li
Xiang Wan
Guanbin Li
MedIm
ViT
85
12
0
22 Oct 2023
Zone Evaluation: Revealing Spatial Bias in Object Detection
Zhaohui Zheng
Yuming Chen
Qibin Hou
Xiang Li
Ping Wang
Ming-Ming Cheng
ObjD
116
4
0
20 Oct 2023
Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection
Lingchen Meng
Xiyang Dai
Jianwei Yang
Dongdong Chen
Yinpeng Chen
Mengchen Liu
Yi-Ling Chen
Zuxuan Wu
Lu Yuan
Yu-Gang Jiang
74
7
0
18 Oct 2023
Rank-DETR for High Quality Object Detection
Yifan Pu
Weicong Liang
Yiduo Hao
Yuhui Yuan
Yukang Yang
Chao Zhang
Hanhua Hu
Gao Huang
101
61
0
13 Oct 2023
X-Pose: Detecting Any Keypoints
Jie Yang
Ailing Zeng
Ruimao Zhang
Lei Zhang
93
7
0
12 Oct 2023
Uni3DETR: Unified 3D Detection Transformer
Zhenyu Wang
Yali Li
Xi Chen
Hengshuang Zhao
Shengjin Wang
3DPC
94
22
0
09 Oct 2023
Visual inspection for illicit items in X-ray images using Deep Learning
Ioannis Mademlis
Georgios Batsis
Adamantia Anna Rebolledo Chrysochoou
Georgios Th. Papadopoulos
52
4
0
05 Oct 2023
Toloka Visual Question Answering Benchmark
Mert Pilanci
Nikita Pavlichenko
Sergey Koshelev
Daniil Likhobaba
Alisa Smirnova
81
4
0
28 Sep 2023
Can the Query-based Object Detector Be Designed with Fewer Stages?
Jialin Li
Weifu Fu
Yu-Hsiang Lin
Qiang Nie
Yang Liu
ObjD
104
1
0
28 Sep 2023