2203.03605
Cited By
v4 (latest)
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
7 March 2022
Hao Zhang
Feng Li
Shilong Liu
Lei Zhang
Hang Su
Jun Zhu
L. Ni
H. Shum
ViT
Github (2506★)
Papers citing
"DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
50 / 742 papers shown
Title
Bridging the Gap Between End-to-end and Non-End-to-end Multi-Object Tracking
Feng Yan
Weihua Luo
Yujie Zhong
Yiyang Gan
Lin Ma
VOT
110
18
0
22 May 2023
Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model
Jie Yang
Bing Li
Fengyu Yang
Ailing Zeng
Lei Zhang
Ruimao Zhang
VLM
DiffM
118
17
0
20 May 2023
Hausdorff Distance Matching with Adaptive Query Denoising for Rotated Detection Transformer
Hakjin Lee
Minki Song
Jamyoung Koo
Junghoon Seo
116
8
0
12 May 2023
Segment and Track Anything
Yangming Cheng
Liulei Li
Yuanyou Xu
Xiaodi Li
Zongxin Yang
Wenguan Wang
Yi Yang
VOS
98
205
0
11 May 2023
WeLayout: WeChat Layout Analysis System for the ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents
Mingliang Zhang
Zhen Cao
Juntao Liu
Liqiang Niu
Fandong Meng
Jie Zhou
70
7
0
11 May 2023
Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving
Xiaosong Jia
Peng Wu
Li Chen
Jiangwei Xie
Conghui He
Junchi Yan
Hongyang Li
96
114
0
10 May 2023
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation
Ayan Banerjee
Sanket Biswas
Josep Lladós
Umapada Pal
ViT
90
16
0
08 May 2023
Distributional Instance Segmentation: Modeling Uncertainty and High Confidence Predictions with Latent-MaskRCNN
YuXuan Liu
Nikhil Mishra
Pieter Abbeel
Xi Chen
ISeg
UQCV
116
4
0
03 May 2023
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer
Yifang Xu
Yunzhuo Sun
Yang Li
Yilei Shi
Xiaoxia Zhu
S. Du
ViT
117
35
0
29 Apr 2023
A Strong and Reproducible Object Detector with Only Public Datasets
Tianhe Ren
Jianwei Yang
Siyi Liu
Ailing Zeng
Feng Li
Hao Zhang
Hongyang Li
Zhaoyang Zeng
Lei Zhang
ObjD
80
11
0
25 Apr 2023
End-to-End Spatio-Temporal Action Localisation with Video Transformers
A. Gritsenko
Xuehan Xiong
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
Anurag Arnab
ViT
90
14
0
24 Apr 2023
OmniLabel: A Challenging Benchmark for Language-Based Object Detection
S. Schulter
G. VijayKumarB.
Yumin Suh
Konstantinos M. Dafnis
Zhixing Zhang
Shiyu Zhao
Dimitris N. Metaxas
ObjD
75
12
0
22 Apr 2023
LipsFormer: Introducing Lipschitz Continuity to Vision Transformers
Xianbiao Qi
Jianan Wang
Yihao Chen
Yukai Shi
Lei Zhang
98
21
0
19 Apr 2023
Transformer-Based Visual Segmentation: A Survey
Xiangtai Li
Henghui Ding
Haobo Yuan
Wenwei Zhang
Jiangmiao Pang
Guangliang Cheng
Kai-xiang Chen
Ziwei Liu
Chen Change Loy
ViT
MedIm
170
147
0
19 Apr 2023
MMDR: A Result Feature Fusion Object Detection Approach for Autonomous System
Wendong Zhang
46
0
0
19 Apr 2023
DETRs Beat YOLOs on Real-time Object Detection
Yian Zhao
Wenyu Lv
Shangliang Xu
Jinman Wei
Guanzhong Wang
Qingqing Dang
Yi Liu
Cheng Cui
113
1,015
0
17 Apr 2023
Permutation Equivariance of Transformers and Its Applications
Hengyuan Xu
Liyao Xiang
Hang Ye
Dixi Yao
Pengzhi Chu
Baochun Li
56
15
0
16 Apr 2023
Align-DETR: Improving DETR with Simple IoU-aware BCE loss
Zhi Cai
Songtao Liu
Guodong Wang
Zheng Ge
Xiangyu Zhang
Di Huang
95
4
0
15 Apr 2023
CornerFormer: Boosting Corner Representation for Fine-Grained Structured Reconstruction
Hongbo Tian
Yulong Li
Linzhi Huang
Xu Ling
Yue Yang
Weihong Deng
3DV
74
0
0
14 Apr 2023
Open-TransMind: A New Baseline and Benchmark for 1st Foundation Model Challenge of Intelligent Transportation
Yifeng Shi
Feng Lv
Xinliang Wang
Chunlong Xia
Shaojie Li
Shu-Zhen Yang
Teng Xi
Gang Zhang
VLM
156
13
0
12 Apr 2023
StageInteractor: Query-based Object Detector with Cross-stage Interaction
Yao Teng
Haisong Liu
Sheng Guo
Limin Wang
ObjD
89
8
0
11 Apr 2023
Detection Transformer with Stable Matching
Siyi Liu
Tianhe Ren
Jia-Yu Chen
Zhaoyang Zeng
Hao Zhang
...
Hongyang Li
Jun Huang
Hang Su
Jun Zhu
Lei Zhang
80
36
0
10 Apr 2023
StillFast: An End-to-End Approach for Short-Term Object Interaction Anticipation
Francesco Ragusa
G. Farinella
Antonino Furnari
77
18
0
08 Apr 2023
V3Det: Vast Vocabulary Visual Detection Dataset
Jiaqi Wang
Pan Zhang
Tao Chu
Yuhang Cao
Yujie Zhou
Tong Wu
Bin Wang
Conghui He
Dahua Lin
VLM
ObjD
119
55
0
07 Apr 2023
Language-aware Multiple Datasets Detection Pretraining for DETRs
Jing Hao
Song Chen
Xiaodi Wang
Shumin Han
ObjD
82
3
0
07 Apr 2023
Continual Detection Transformer for Incremental Object Detection
Yaoyao Liu
Bernt Schiele
Andrea Vedaldi
Christian Rupprecht
CLL
83
56
0
06 Apr 2023
Boundary-Denoising for Video Activity Localization
Mengmeng Xu
Mattia Soldan
Jialin Gao
Shuming Liu
Juan-Manuel Perez-Rua
Guohao Li
70
10
0
06 Apr 2023
VoxelFormer: Bird's-Eye-View Feature Generation based on Dual-view Attention for Multi-view 3D Object Detection
Zhuoling Li
Chuanrui Zhang
Wei-Chiu Ma
Yipin Zhou
Linyan Huang
Haoqian Wang
SerNam Lim
Hengshuang Zhao
62
6
0
03 Apr 2023
Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction
Zhuofan Zong
Dong Jiang
Guanglu Song
Zeyue Xue
Jingyong Su
Hongsheng Li
Yu Liu
118
39
0
03 Apr 2023
Siamese DETR
Ze-Sen Chen
Gengshi Huang
Wei Li
Jianing Teng
Kun Wang
Jing Shao
Chen Change Loy
Lu Sheng
ViT
82
9
0
31 Mar 2023
DDP: Diffusion Model for Dense Visual Prediction
Yuanfeng Ji
Zhe Chen
Enze Xie
Lanqing Hong
Xihui Liu
Zhaoqiang Liu
Tong Lu
Zhenguo Li
Ping Luo
DiffM
VLM
133
138
0
30 Mar 2023
SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
Abdelrahman M. Shaker
Muhammad Maaz
H. Rasheed
Salman Khan
Ming-Hsuan Yang
Fahad Shahbaz Khan
ViT
157
98
0
27 Mar 2023
ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box
Yifu Zhang
Xing-Hui Wang
Xiaoqing Ye
Wei Zhang
Jincheng Lu
Xiao Tan
Errui Ding
Pei Sun
Jingdong Wang
VOT
88
23
0
27 Mar 2023
The effectiveness of MAE pre-pretraining for billion-scale pretraining
Mannat Singh
Quentin Duval
Kalyan Vasudev Alwala
Haoqi Fan
Vaibhav Aggarwal
...
Piotr Dollár
Christoph Feichtenhofer
Ross B. Girshick
Rohit Girdhar
Ishan Misra
LRM
182
71
0
23 Mar 2023
Dense Distinct Query for End-to-End Object Detection
Shilong Zhang
Wang xinjiang
Jiaqi Wang
Jiangmiao Pang
Chengqi Lyu
Wenwei Zhang
Ping Luo
Kai-xiang Chen
132
135
0
22 Mar 2023
LiDARFormer: A Unified Transformer-based Multi-task Network for LiDAR Perception
Zixiang Zhou
Dongqiangzi Ye
Weijia Chen
Yufei Xie
Yu Wang
Panqu Wang
H. Foroosh
65
10
0
21 Mar 2023
Detecting Everything in the Open World: Towards Universal Object Detection
Zhenyu Wang
Yali Li
Xi Chen
Ser-Nam Lim
Antonio Torralba
Hengshuang Zhao
Shengjin Wang
ObjD
VLM
82
79
0
21 Mar 2023
Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer
Jiawei Wang
Weihong Lin
Chixiang Ma
Mingze Li
Zhengmao Sun
Lei-huan Sun
Qiang Huo
LMTD
129
18
0
21 Mar 2023
EVA-02: A Visual Representation for Neon Genesis
Yuxin Fang
Quan-Sen Sun
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
ViT
CLIP
127
289
0
20 Mar 2023
CCTV-Gun: Benchmarking Handgun Detection in CCTV Images
Srikar Yellapragada
Zhenghong Li
K. Doshi
Purva Mhasakar
Heng Fan
Jieda Wei
Erik P. Blasch
Bin Zhang
Haibin Ling
59
6
0
19 Mar 2023
CAPE: Camera View Position Embedding for Multi-View 3D Object Detection
Kaixin Xiong
Shi Gong
Xiaoqing Ye
Xiao Tan
Ji Wan
Errui Ding
Jingdong Wang
Xiang Bai
3DPC
75
36
0
17 Mar 2023
DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection
Jiawei Ma
Yulei Niu
Jincheng Xu
Shiyuan Huang
G. Han
Shih-Fu Chang
ObjD
86
37
0
16 Mar 2023
FAQ: Feature Aggregated Queries for Transformer-based Video Object Detectors
Yiming Cui
Linjie Yang
ViT
83
15
0
15 Mar 2023
A Simple Framework for Open-Vocabulary Segmentation and Detection
Hao Zhang
Feng Li
Xueyan Zou
Siyi Liu
Chun-yue Li
Jianfeng Gao
Jianwei Yang
Lei Zhang
ObjD
VLM
93
162
0
14 Mar 2023
MP-Former: Mask-Piloted Transformer for Image Segmentation
Hao Zhang
Feng Li
Hu-Sheng Xu
Shijia Huang
Siyi Liu
L. Ni
Lei Zhang
ViT
MedIm
113
60
0
13 Mar 2023
Lite DETR : An Interleaved Multi-Scale Encoder for Efficient DETR
Feng Li
Ailing Zeng
Siyi Liu
Hao Zhang
Hongyang Li
Lei Zhang
L. Ni
ViT
89
71
0
13 Mar 2023
Object-Centric Multi-Task Learning for Human Instances
Hyeongseok Son
Sang-Il Jung
Solae Lee
Seong-heum Kim
Seungsang Park
ByungIn Yoo
3DH
128
0
0
13 Mar 2023
Universal Instance Perception as Object Discovery and Retrieval
B. Yan
Yi Jiang
Jiannan Wu
D. Wang
Ping Luo
Zehuan Yuan
Huchuan Lu
VOS
VLM
LRM
148
176
0
12 Mar 2023
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
Shilong Liu
Zhaoyang Zeng
Tianhe Ren
Feng Li
Hao Zhang
...
Chun-yue Li
Jianwei Yang
Hang Su
Jun Zhu
Lei Zhang
ObjD
206
2,037
0
09 Mar 2023
Mimic before Reconstruct: Enhancing Masked Autoencoders with Feature Mimicking
Peng Gao
Renrui Zhang
Rongyao Fang
Ziyi Lin
Hongyang Li
Hongsheng Li
Qiao Yu
65
19
0
09 Mar 2023