2203.03605
Cited By
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
7 March 2022
Hao Zhang
Feng Li
Shilong Liu
Lei Zhang
Hang Su
Jun Zhu
L. Ni
H. Shum
ViT
Github (2506★)
Papers citing
"DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
50 / 742 papers shown
Title
SpaRC: Sparse Radar-Camera Fusion for 3D Object Detection
Philipp Wolters
Johannes Gilg
Torben Teepe
Fabian Herzog
Felix Fent
Gerhard Rigoll
135
0
0
29 Nov 2024
TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video
Jinyuan Qu
Hongyang Li
Shilong Liu
Tianhe Ren
Zhaoyang Zeng
Lei Zhang
3DPC
138
1
0
27 Nov 2024
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
216
10
0
27 Nov 2024
Exploring Aleatoric Uncertainty in Object Detection via Vision Foundation Models
Peng Cui
Guande He
Dan Zhang
Zhijie Deng
Yinpeng Dong
Jun Zhu
177
1
0
26 Nov 2024
Open Vocabulary Monocular 3D Object Detection
Jin Yao
Hao Gu
Xuweiyi Chen
Jiayun Wang
Zezhou Cheng
ObjD
VLM
121
3
0
25 Nov 2024
Edge Weight Prediction For Category-Agnostic Pose Estimation
Or Hirschorn
S. Avidan
150
0
0
25 Nov 2024
VideoOrion: Tokenizing Object Dynamics in Videos
Yicheng Feng
Yijiang Li
Wanpeng Zhang
Sipeng Zheng
Zongqing Lu
171
2
0
25 Nov 2024
DT-LSD: Deformable Transformer-based Line Segment Detection
Sebastian Janampa
Marios Pattichis
ViT
158
1
0
20 Nov 2024
CLIC: Contrastive Learning Framework for Unsupervised Image Complexity Representation
Shipeng Liu
Liang Zhao
Dengfeng Chen
SSL
232
1
0
19 Nov 2024
WoodYOLO: A Novel Object Detector for Wood Species Detection in Microscopic Images
Lars Nieradzik
Henrike Stephani
Jördis Sieburg-Rockel
Stephanie Helmling
Andrea Olbrich
Stephanie Wrage
J. Keuper
112
0
0
18 Nov 2024
EVT: Efficient View Transformation for Multi-Modal 3D Object Detection
Yongjin Lee
Hyeon-Mun Jeong
Yurim Jeon
Sanghyun Kim
141
0
0
16 Nov 2024
Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning
Jingru Yang
Huan Yu
Yang Jingxin
C. Xu
Yin Biao
Yu Sun
Shengfeng He
58
1
0
15 Nov 2024
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
180
0
0
12 Nov 2024
White-Box Diffusion Transformer for single-cell RNA-seq generation
Zhuorui Cui
Shengze Dong
Ding Liu
40
1
0
11 Nov 2024
Efficient Feature Aggregation and Scale-Aware Regression for Monocular 3D Object Detection
Yifan Wang
Xiaohu Yang
Fanqi Pu
Q. Liao
Wenming Yang
76
0
0
05 Nov 2024
GigaCheck: Detecting LLM-generated Content
Irina Tolstykh
Aleksandra Tsybina
Sergey Yakubson
Aleksandr Gordeev
Vladimir Dokholyan
Maksim Kuprashevich
DeLMO
86
2
0
31 Oct 2024
Unbiased Regression Loss for DETRs
Edric
Ueta Daisuke
Kurokawa Yukimasa
Karlekar Jayashree
Sugiri Pranata
66
0
0
30 Oct 2024
Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models
Shenghao Fu
Junkai Yan
Q. Yang
Xihan Wei
Xiaohua Xie
Wei-Shi Zheng
VLM
111
3
0
25 Oct 2024
On Occlusions in Video Action Detection: Benchmark Datasets And Training Recipes
Rajat Modi
Vibhav Vineet
Yogesh S Rawat
86
2
0
25 Oct 2024
DREB-Net: Dual-stream Restoration Embedding Blur-feature Fusion Network for High-mobility UAV Object Detection
Qingpeng Li
Yuxin Zhang
Leyuan Fang
Yuhan Kang
Shutao Li
Xiao Xiang Zhu
71
1
0
23 Oct 2024
AlphaChimp: Tracking and Behavior Recognition of Chimpanzees
Xiaoxuan Ma
Yutang Lin
Yuan Xu
Stephan P. Kaufhold
Jack Terwilliger
Andres Meza
Yixin Zhu
Federico Rossano
Yizhou Wang
122
0
0
22 Oct 2024
DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model
Zhixiong Nan
Xianghong Li
Tao Xiang
Jifeng Dai
ISeg
101
1
0
22 Oct 2024
PlaneSAM: Multimodal Plane Instance Segmentation Using the Segment Anything Model
Zhongchen Deng
Zhechen Yang
Chi Chen
Cheng Zeng
Yan Meng
Bisheng Yang
53
1
0
21 Oct 2024
Adventures with Grace Hopper AI Super Chip and the National Research Platform
J. Alex Hurt
Grant J. Scott
Derek Weitzel
Huijun Zhu
26
1
0
21 Oct 2024
ViMoE: An Empirical Study of Designing Vision Mixture-of-Experts
Xumeng Han
Longhui Wei
Zhiyang Dou
Zipeng Wang
Chenhui Qiang
Xin He
Yingfei Sun
Zhenjun Han
Qi Tian
MoE
85
5
0
21 Oct 2024
Triplane Grasping: Efficient 6-DoF Grasping with Single RGB Images
Yiming Li
Hanchi Ren
Yue Yang
Jingjing Deng
Xianghua Xie
116
0
0
21 Oct 2024
A Survey of Hallucination in Large Visual Language Models
Wei Lan
Wenyi Chen
Qingfeng Chen
Shirui Pan
Huiyu Zhou
Yi-Lun Pan
LRM
92
6
0
20 Oct 2024
Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability
Yusuke Hosoya
Masanori Suganuma
Takayuki Okatani
ObjD
92
0
0
20 Oct 2024
D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement
Yansong Peng
Hebei Li
Peixi Wu
Yueyi Zhang
Xingwu Sun
Feng Wu
99
17
0
17 Oct 2024
OAH-Net: A Deep Neural Network for Hologram Reconstruction of Off-axis Digital Holographic Microscope
Wei Liu
Kerem Delikoyun
Qianyu Chen
Alperen Yildiz
Si Ko Myo
Win Sen Kuan
John Tshon Yit Soong
Matthew Edward Cove
Oliver Hayden
Hweekuan Lee
60
0
0
17 Oct 2024
VividMed: Vision Language Model with Versatile Visual Grounding for Medicine
Lingxiao Luo
Bingda Tang
Xuanzhong Chen
Rong Han
Ting Chen
VLM
93
3
0
16 Oct 2024
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
Zhiyuan Zhao
Hengrui Kang
Bin Wang
Zeang Sheng
69
17
0
16 Oct 2024
big.LITTLE Vision Transformer for Efficient Visual Recognition
He Guo
Yulong Wang
Zixuan Ye
Jifeng Dai
Yuwen Xiong
ViT
94
0
0
14 Oct 2024
ET-Former: Efficient Triplane Deformable Attention for 3D Semantic Scene Completion From Monocular Camera
Jing Liang
He Yin
Xuewei Qi
Jong Jin Park
Min Sun
R. Madhivanan
Dinesh Manocha
3DPC
127
0
0
14 Oct 2024
UnSeg: One Universal Unlearnable Example Generator is Enough against All Image Segmentation
Ye Sun
Hao Zhang
Tiehua Zhang
Xingjun Ma
Yu-Gang Jiang
VLM
89
4
0
13 Oct 2024
Ego3DT: Tracking Every 3D Object in Ego-centric Videos
Shengyu Hao
Wenhao Chai
Zhonghan Zhao
Meiqi Sun
Wendi Hu
...
Yixian Zhao
Qi Li
Yizhou Wang
Xi Li
Gaoang Wang
87
3
0
11 Oct 2024
Multi-Scale Deformable Transformers for Student Learning Behavior Detection in Smart Classroom
Zhifeng Wang
Minghui Wang
Chunyan Zeng
Longlong Li
58
1
0
10 Oct 2024
Boosting Few-Shot Detection with Large Language Models and Layout-to-Image Synthesis
Ahmed Abdullah
Nikolas Ebert
Oliver Wasenmüller
ObjD
60
1
0
09 Oct 2024
Improving Object Detection via Local-global Contrastive Learning
Danai Triantafyllidou
Sarah Parisot
A. Leonardis
Jingyu Sun
111
2
0
07 Oct 2024
Cross Resolution Encoding-Decoding For Detection Transformers
Ashish Kumar
Jaesik Park
ViT
96
0
0
05 Oct 2024
3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection
Yang Cao
Yuanliang Jv
Dan Xu
3DGS
65
3
0
02 Oct 2024
Saliency-Guided DETR for Moment Retrieval and Highlight Detection
Aleksandr Gordeev
Vladimir Dokholyan
Irina Tolstykh
Maksim Kuprashevich
77
7
0
02 Oct 2024
Pose Estimation of Buried Deep-Sea Objects using 3D Vision Deep Learning Models
Jerry Yan
Chinmay Talegaonkar
Nicholas Antipa
Eric Terrill
Sophia Merrifield
3DPC
38
1
0
01 Oct 2024
Class-Agnostic Visio-Temporal Scene Sketch Semantic Segmentation
Aleyna Kütük
Tevfik Metin Sezgin
116
2
0
30 Sep 2024
Intelligent Fish Detection System with Similarity-Aware Transformer
Shengchen Li
Haobo Zuo
Changhong Fu
Zhiyong Wang
Zhiqiang Xu
ViT
95
0
0
28 Sep 2024
Embed and Emulate: Contrastive representations for simulation-based inference
Ruoxi Jiang
Peter Y. Lu
Rebecca Willett
63
1
0
27 Sep 2024
You Only Speak Once to See
Wenhao Yang
Jianguo Wei
Wenhuan Lu
Lei Li
VOS
61
2
0
27 Sep 2024
ZALM3: Zero-Shot Enhancement of Vision-Language Alignment via In-Context Information in Multi-Turn Multimodal Medical Dialogue
Zhangpu Li
Changhong Zou
Suxue Ma
Zhicheng Yang
Chen Du
...
Xingzhi Sun
Jing Xiao
Kai Zhang
Mei Han
LM&MA
98
1
0
26 Sep 2024
Source-Free Domain Adaptation for YOLO Object Detection
Simon Varailhon
Masih Aminbeidokhti
M. Pedersoli
Eric Granger
TTA
ObjD
90
4
0
25 Sep 2024
General Detection-based Text Line Recognition
Raphael Baena
Syrine Kalleli
Mathieu Aubry
448
0
0
25 Sep 2024