2010.04159
Deformable DETR: Deformable Transformers for End-to-End Object Detection
8 October 2020
Xizhou Zhu
Weijie Su
Lewei Lu
Bin Li
Xiaogang Wang
Jifeng Dai
ViT
Github (3553★)
Papers citing "Deformable DETR: Deformable Transformers for End-to-End Object Detection"
50 / 2,533 papers shown
End-to-End Semi-Supervised approach with Modulated Object Queries for Table Detection in Documents
Iqraa Ehsan
Tahira Shehzadi
Didier Stricker
Muhammad Zeshan Afzal
LMTD
63
5
0
08 May 2024
ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers
Jinke Li
Xiao He
Chonghua Zhou
Xiaoqiang Cheng
Yang Wen
Dan Zhang
ViT
82
16
0
07 May 2024
Deep Event-based Object Detection in Autonomous Driving: A Survey
Bin Zhou
Jie Jiang
95
0
0
07 May 2024
S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling
Minh-Triet Tran
Adrian de Luis
Haitao Liao
Ying Huang
Roy McCann
Alan Mantooth
Jack Cothren
Ngan Le
258
0
0
07 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
174
48
0
06 May 2024
Direct Training High-Performance Deep Spiking Neural Networks: A Review of Theories and Methods
Chenlin Zhou
Han Zhang
Liutao Yu
Yumin Ye
Zhaokun Zhou
Liwei Huang
Zhengyu Ma
Xiaopeng Fan
Huihui Zhou
Yonghong Tian
116
13
0
06 May 2024
Enhancing DETRs Variants through Improved Content Query and Similar Query Aggregation
Yingying Zhang
Chuangji Shi
Xin Guo
Jiangwei Lao
Jian Wang
Jiaotuan Wang
Jingdong Chen
81
3
0
06 May 2024
Vision-based 3D occupancy prediction in autonomous driving: a review and outlook
Yanan Zhang
Jinqing Zhang
Zengran Wang
Junhao Xu
Di Huang
77
18
0
04 May 2024
ViTALS: Vision Transformer for Action Localization in Surgical Nephrectomy
Soumyadeep Chandra
Sayeed Shafayet Chowdhury
Courtney Yong
Chandru P. Sundaram
Kaushik Roy
56
0
0
04 May 2024
Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey
Guoping Xu
Xiaxia Wang
Xinglong Wu
Xuesong Leng
Yongchao Xu
3DPC
95
11
0
02 May 2024
Imagine the Unseen: Occluded Pedestrian Detection via Adversarial Feature Completion
Shanshan Zhang
Mingqian Ji
Yang Li
Jian Yang
88
1
0
02 May 2024
Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models
Xiaoshi Wu
Yiming Hao
Manyuan Zhang
Keqiang Sun
Zhaoyang Huang
Guanglu Song
Yu Liu
Hongsheng Li
EGVM
127
25
0
01 May 2024
Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey
Dayou Du
Gu Gong
Xiaowen Chu
MQ
140
8
0
01 May 2024
Towards End-to-End Semi-Supervised Table Detection with Semantic Aligned Matching Transformer
Tahira Shehzadi
Shalini Sarode
Didier Stricker
Muhammad Zeshan Afzal
LMTD
108
4
0
30 Apr 2024
VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization
Yuliang Liu
Mingxin Huang
Hao Yan
Linger Deng
Weijia Wu
Hao Lu
Chunhua Shen
Lianwen Jin
Xiang Bai
86
0
0
30 Apr 2024
Reliable or Deceptive? Investigating Gated Features for Smooth Visual Explanations in CNNs
Soham Mitra
Atri Sukul
Swalpa Kumar Roy
Pravendra Singh
Vinay Kumar Verma
AAML
FAtt
55
0
0
30 Apr 2024
Robust Pedestrian Detection via Constructing Versatile Pedestrian Knowledge Bank
Sungjune Park
Hyunjun Kim
Y. Ro
90
13
0
30 Apr 2024
C2FDrone: Coarse-to-Fine Drone-to-Drone Detection using Vision Transformer Networks
Sairam VC Rebbapragada
Pranoy Panda
Vineeth N. Balasubramanian
ViT
97
5
0
30 Apr 2024
Dexterous Grasp Transformer
Guo-Hao Xu
Yi-Lin Wei
Dian Zheng
Xiao-Ming Wu
Wei-Shi Zheng
ViT
76
13
0
28 Apr 2024
A Hybrid Approach for Document Layout Analysis in Document images
Tahira Shehzadi
Didier Stricker
Muhammad Zeshan Afzal
69
5
0
27 Apr 2024
Efficient Bi-manipulation using RGBD Multi-model Fusion based on Attention Mechanism
Jian Shen
Jiaxin Huang
Zhigong Song
28
0
0
27 Apr 2024
Sparse Reconstruction of Optical Doppler Tomography with Alternative State Space Model and Attention
Zhenghong Li
Jiaxiang Ren
Wensheng Cheng
C. Du
Yingtian Pan
Haibin Ling
66
0
0
26 Apr 2024
UniRGB-IR: A Unified Framework for Visible-Infrared Semantic Tasks via Adapter Tuning
Maoxun Yuan
Bo Cui
Tianyi Zhao
Xingxing Wei
Shan Fu
Xue Yang
Xingxing Wei
112
0
0
26 Apr 2024
Features Fusion for Dual-View Mammography Mass Detection
Arina Varlamova
Valery Belotsky
Grigory Novikov
Anton Konushin
Evgeny Sidorov
MedIm
37
1
0
25 Apr 2024
Multi-Scale Representations by Varying Window Attention for Semantic Segmentation
Haotian Yan
Ming Wu
Chuang Zhang
100
14
0
25 Apr 2024
BezierFormer: A Unified Architecture for 2D and 3D Lane Detection
Zhiwei Dong
Xi Zhu
Xiya Cao
Ran Ding
Wei Li
Caifa Zhou
Yongliang Wang
Qiangbo Liu
108
3
0
25 Apr 2024
ChEX: Interactive Localization and Region Description in Chest X-rays
Philip Muller
Georgios Kaissis
Daniel Rueckert
88
5
0
24 Apr 2024
SRAGAN: Saliency Regularized and Attended Generative Adversarial Network for Chinese Ink-wash Painting Generation
Xiang Gao
Yuqi Zhang
GAN
70
0
0
24 Apr 2024
OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving
Guoqing Wang
Zhongdao Wang
Pin Tang
Jilai Zheng
Xiangxuan Ren
Bailan Feng
Chao Ma
DiffM
98
19
0
23 Apr 2024
DesignProbe: A Graphic Design Benchmark for Multimodal Large Language Models
Jieru Lin
Danqing Huang
Tiejun Zhao
Dechen Zhan
Chin-Yew Lin
VLM
MLLM
62
3
0
23 Apr 2024
Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation
Abhishek Aich
Yumin Suh
S. Schulter
Manmohan Chandraker
162
0
0
23 Apr 2024
PM-VIS: High-Performance Box-Supervised Video Instance Segmentation
Zhangjing Yang
Dun Liu
Wensheng Cheng
Jinqiao Wang
Yi Wu
VLM
65
2
0
22 Apr 2024
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
Chuofan Ma
Yi Jiang
Jiannan Wu
Zehuan Yuan
Xiaojuan Qi
VLM
ObjD
113
65
0
19 Apr 2024
FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving
Xingtai Gui
Tengteng Huang
Haonan Shao
Haotian Yao
Chi Zhang
77
4
0
19 Apr 2024
Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery
Yona Falinie A. Gaus
Neelanjan Bhowmik
Brian K. S. Isaac-Medina
T. Breckon
VLM
80
2
0
18 Apr 2024
MLS-Track: Multilevel Semantic Interaction in RMOT
Zeliang Ma
Yang Song
Zhe Cui
Zhicheng Zhao
Fei Su
Delong Liu
Jingyu Wang
76
3
0
18 Apr 2024
Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation
Qiyuan Dai
Sibei Yang
86
9
0
18 Apr 2024
Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation
Song Wang
Jiawei Yu
Wentong Li
Wenyu Liu
Xiaolu Liu
Junbo Chen
Jianke Zhu
122
22
0
18 Apr 2024
Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition
Xunsong Li
Pengzhan Sun
Yangcen Liu
Lixin Duan
Wen Li
124
3
0
18 Apr 2024
TempBEV: Improving Learned BEV Encoders with Combined Image and BEV Space Temporal Aggregation
T. Monninger
Vandana Dokkadi
Md Zafar Anwar
Steffen Staab
65
2
0
17 Apr 2024
Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach
Mir Rayat Imtiaz Hossain
Mennatullah Siam
Leonid Sigal
James J. Little
VLM
100
7
0
17 Apr 2024
Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems
Luca Bompani
Manuele Rusci
Daniele Palossi
Francesco Conti
Luca Benini
MQ
72
0
0
17 Apr 2024
CarcassFormer: An End-to-end Transformer-based Framework for Simultaneous Localization, Segmentation and Classification of Poultry Carcass Defect
Minh Q. Tran
Sang Truong
Arthur F. A. Fernandes
Michael Kidd
Ngan Le
ViT
100
4
0
17 Apr 2024
HybriMap: Hybrid Clues Utilization for Effective Vectorized HD Map Construction
Chi Zhang
Qi Song
Feifei Li
Yongquan Chen
Rui Huang
75
2
0
17 Apr 2024
OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery
Matthew J. Inkawhich
Nathan Inkawhich
Hao Yang
Jingyang Zhang
Randolph Linderman
Yiran Chen
ObjD
103
0
0
16 Apr 2024
No More Ambiguity in 360° Room Layout via Bi-Layout Estimation
Yu-Ju Tsai
Jin-Cheng Jhang
Jingjing Zheng
Wei Wang
Albert Y. C. Chen
Min Sun
Cheng-Hao Kuo
Ming-Hsuan Yang
3DV
70
4
0
15 Apr 2024
Design and Analysis of Efficient Attention in Transformers for Social Group Activity Recognition
Masato Tamura
61
2
0
15 Apr 2024
STMixer: A One-Stage Sparse Action Detector
Tao Wu
Mengqing Cao
Ziteng Gao
Gangshan Wu
Limin Wang
83
0
0
15 Apr 2024
SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction
Pin Tang
Zhongdao Wang
Guoqing Wang
Jilai Zheng
Xiangxuan Ren
Bailan Feng
Chao Ma
91
44
0
15 Apr 2024
Q2A: Querying Implicit Fully Continuous Feature Pyramid to Align Features for Medical Image Segmentation
Jiahao Yu
Li Chen
113
0
0
15 Apr 2024