2010.04159
Deformable DETR: Deformable Transformers for End-to-End Object Detection
8 October 2020
Xizhou Zhu
Weijie Su
Lewei Lu
Bin Li
Xiaogang Wang
Jifeng Dai
ViT
Github (3553★)
Papers citing
"Deformable DETR: Deformable Transformers for End-to-End Object Detection"
50 / 2,533 papers shown
Title
ViLLa: Video Reasoning Segmentation with Large Language Model
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
Kun Wang
Yu Qiao
Hengshuang Zhao
VOS
LRM
176
5
0
18 Jul 2024
GroupMamba: Efficient Group-Based Visual State Space Model
Abdelrahman M. Shaker
Syed Talal Wasim
Salman Khan
Juergen Gall
Fahad Shahbaz Khan
Mamba
93
0
0
18 Jul 2024
AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
Zhuguanyu Wu
Jiaxin Chen
Hanwen Zhong
Di Huang
Yun Wang
MQ
118
12
0
17 Jul 2024
MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
Leyang Shen
Gongwei Chen
Rui Shao
Weili Guan
Liqiang Nie
MoE
77
12
0
17 Jul 2024
Hierarchical and Decoupled BEV Perception Learning Framework for Autonomous Driving
Yuqi Dai
Jian Sun
Shengbo Eben Li
Qing Xu
Jianqiang Wang
Lei He
Keqiang Li
83
2
0
17 Jul 2024
VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions
Seokha Moon
Hyun Woo
Hongbeen Park
Haeji Jung
R. Mahjourian
Hyung-Gun Chi
Hyerin Lim
Sangpil Kim
Jinkyu Kim
66
6
0
17 Jul 2024
Hierarchical Separable Video Transformer for Snapshot Compressive Imaging
Ping Wang
Yulun Zhang
Lishun Wang
Xin Yuan
ViT
110
2
0
16 Jul 2024
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection
Xiuquan Hou
Mei-qin Liu
Senlin Zhang
Ping Wei
Badong Chen
Xuguang Lan
ViT
100
17
0
16 Jul 2024
Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection
Qijie Mo
Yipeng Gao
Shenghao Fu
Junkai Yan
Ancong Wu
Wei-Shi Zheng
CLL
93
7
0
16 Jul 2024
Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes
Zhi Cai
Yingjie Gao
Yaoyan Zheng
Nan Zhou
Di Huang
VLM
91
6
0
16 Jul 2024
CycleHOI: Improving Human-Object Interaction Detection with Cycle Consistency of Detection and Generation
Yisen Wang
Yao Teng
Limin Wang
DiffM
114
1
0
16 Jul 2024
Continuity Preserving Online CenterLine Graph Learning
Yunhui Han
Kun Yu
Zhiwei Li
GNN
3DPC
113
2
0
16 Jul 2024
TCFormer: Visual Recognition via Token Clustering Transformer
Wang Zeng
Sheng Jin
Lumin Xu
Wentao Liu
Chao Qian
Wanli Ouyang
Ping Luo
Xiaogang Wang
77
5
0
16 Jul 2024
Can Textual Semantics Mitigate Sounding Object Segmentation Preference?
Yaoting Wang
Peiwen Sun
Yuanchao Li
Honggang Zhang
Di Hu
102
5
0
15 Jul 2024
RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception
Chunliang Li
Wencheng Han
Junbo Yin
Sanyuan Zhao
Jianbing Shen
81
4
0
15 Jul 2024
GTPT: Group-based Token Pruning Transformer for Efficient Human Pose Estimation
Haonan Wang
Jie Liu
Jie Tang
Gangshan Wu
Bo Xu
Y. Kevin Chou
Yong Wang
ViT
108
3
0
15 Jul 2024
SEED: A Simple and Effective 3D DETR in Point Clouds
Zhe Liu
Jinghua Hou
Xiaoqing Ye
Tong Wang
Jingdong Wang
Xiang Bai
3DPC
89
8
0
15 Jul 2024
Joint-Embedding Predictive Architecture for Self-Supervised Learning of Mask Classification Architecture
Donghee Kim
Sungduk Cho
Hyeonwoo Cho
Chanmin Park
Jinyoung Kim
Won Hwa Kim
96
0
0
15 Jul 2024
FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation
Honghao Xu
Juzhan Xu
Zeyu Huang
Pengfei Xu
Hui Huang
Ruizhen Hu
3DV
79
0
0
15 Jul 2024
PolyRoom: Room-aware Transformer for Floorplan Reconstruction
Yuzhou Liu
Lingjie Zhu
Xiaodong Ma
Hanqiao Ye
Xiang Gao
Xianwei Zheng
Shuhan Shen
65
1
0
15 Jul 2024
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset
Yi Zhang
Wang Zeng
Sheng Jin
Chao Qian
Ping Luo
Wentao Liu
75
6
0
14 Jul 2024
Plain-Det: A Plain Multi-Dataset Object Detector
Cheng Shi
Yuchen Zhu
Sibei Yang
ObjD
VLM
89
2
0
14 Jul 2024
MutDet: Mutually Optimizing Pre-training for Remote Sensing Object Detection
Ziyue Huang
Yongchao Feng
Qingjie Liu
Yunhong Wang
ViT
125
1
0
13 Jul 2024
IFTR: An Instance-Level Fusion Transformer for Visual Collaborative Perception
Shaohong Wang
Lu Bin
Xinyu Xiao
Zhiyu Xiang
Hangguan Shan
Eryun Liu
ViT
111
3
0
13 Jul 2024
Image Compression for Machine and Human Vision with Spatial-Frequency Adaptation
Han Li
Shaohui Li
Shuangrui Ding
Wenrui Dai
Maida Cao
Chenglin Li
Junni Zou
Hongkai Xiong
VLM
106
8
0
13 Jul 2024
Neural-based Video Compression on Solar Dynamics Observatory Images
Atefeh Khoshkhahtinat
Ali Zafari
P. Mehta
Nasser M. Nasrabadi
Barbara J. Thompson
M. Kirk
D. D. Silva
117
0
0
12 Jul 2024
FD-SOS: Vision-Language Open-Set Detectors for Bone Fenestration and Dehiscence Detection from Intraoral Images
Marawan Elbatel
Keyuan Liu
Yanqi Yang
Xuelong Li
58
0
0
12 Jul 2024
Domain-adaptive Video Deblurring via Test-time Blurring
Jin-Ting He
Fu-Jen Tsai
Jia-Hao Wu
Yan-Tsung Peng
Chung-Chi Tsai
Chia-Wen Lin
Yen-Yu Lin
98
1
0
12 Jul 2024
DroneMOT: Drone-based Multi-Object Tracking Considering Detection Difficulties and Simultaneous Moving of Drones and Objects
Peng Wang
Yongcai Wang
Deying Li
VOT
83
3
0
12 Jul 2024
Textual Query-Driven Mask Transformer for Domain Generalized Segmentation
Byeonghyun Pak
Byeongju Woo
Sunghwan Kim
Dae-Hwan Kim
Hoseong Kim
134
5
0
12 Jul 2024
Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness
Honghao Chen
Yurong Zhang
Xiaokun Feng
Xiangxiang Chu
Kaiqi Huang
AAML
81
6
0
12 Jul 2024
Semi-Supervised Object Detection: A Survey on Progress from CNN to Transformer
Tahira Shehzadi
Ifza
Didier Stricker
Muhammad Zeshan Afzal
ViT
103
0
0
11 Jul 2024
Cross Domain Object Detection via Multi-Granularity Confidence Alignment based Mean Teacher
Jiangming Chen
Li Liu
Wanxia Deng
Zhen Liu
Yu Liu
Yingmei Wei
Yongxiang Liu
89
0
0
10 Jul 2024
DIOR-ViT: Differential Ordinal Learning Vision Transformer for Cancer Classification in Pathology Images
Ju Cheon Lee
Keunho Byeon
Boram Song
Kyungeun Kim
Jin Tae Kwak
MedIm
73
0
0
10 Jul 2024
Deformable-Heatmap-Segmentation for Automobile Visual Perception
Hongyu Jin
33
1
0
10 Jul 2024
ActionVOS: Actions as Prompts for Video Object Segmentation
Liangyang Ouyang
Ruicong Liu
Yifei Huang
Ryosuke Furuta
Yoichi Sato
VOS
79
2
0
10 Jul 2024
Exploring Camera Encoder Designs for Autonomous Driving Perception
Barath Lakshmanan
Joshua Chen
Shiyi Lan
Maying Shen
Zhiding Yu
Jose M. Alvarez
112
0
0
09 Jul 2024
D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms
Tajamul Ashraf
K. Rangarajan
Mohit Gambhir
Richa Gabha
Chetan Arora
MedIm
116
2
0
09 Jul 2024
Anatomy-guided Pathology Segmentation
A. Jaus
C. Seibold
Simon Reiß
Lukas Heine
Anton Schily
Moon Kim
F. Bahnsen
Ken Herrmann
Rainer Stiefelhagen
Jens Kleesiek
MedIm
62
3
0
08 Jul 2024
Learning Lane Graphs from Aerial Imagery Using Transformers
Martin Büchner
Simon Dorer
Abhinav Valada
95
0
0
08 Jul 2024
Described Spatial-Temporal Video Detection
Wei Ji
Xiangyan Liu
Yingfei Sun
Jiajun Deng
You Qin
Ammar Nuwanna
Mengyao Qiu
Lina Wei
Roger Zimmermann
108
2
0
08 Jul 2024
Smart Camera Parking System With Auto Parking Spot Detection
Tuan T. Nguyen
Mina Sartipi
79
3
0
07 Jul 2024
JDT3D: Addressing the Gaps in LiDAR-Based Tracking-by-Attention
Brian Cheong
Jiachen Zhou
Steven Waslander
68
1
0
06 Jul 2024
Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection
Zhiqiang Yang
Q. Guan
Keer Zhao
Jianmin Yang
Xinli Xu
Haixia Long
Ying Tang
92
19
0
05 Jul 2024
AMD: Automatic Multi-step Distillation of Large-scale Vision Models
Cheng Han
Qifan Wang
S. Dianat
Majid Rabbani
Raghuveer M. Rao
Yi Fang
Qiang Guan
Lifu Huang
Dongfang Liu
VLM
80
5
0
05 Jul 2024
QueryMamba: A Mamba-Based Encoder-Decoder Architecture with a Statistical Verb-Noun Interaction Module for Video Action Forecasting @ Ego4D Long-Term Action Anticipation Challenge 2024
Zeyun Zhong
Manuel Martin
Frederik Diederichs
Juergen Beyerer
74
5
0
04 Jul 2024
Occupancy as Set of Points
Yiang Shi
Tianheng Cheng
Qian Zhang
Wenyu Liu
Xinggang Wang
3DPC
104
15
0
04 Jul 2024
Cyclic Refiner: Object-Aware Temporal Representation Learning for Multi-View 3D Detection and Tracking
Mingzhe Guo
Zhipeng Zhang
Liping Jing
Yuan He
Ke Wang
Heng Fan
100
1
0
03 Jul 2024
Context-Aware Video Instance Segmentation
Seunghun Lee
Jiwan Seo
Kiljoon Han
Minwoo Choi
S. Im
VOS
77
0
0
03 Jul 2024
Graph and Skipped Transformer: Exploiting Spatial and Temporal Modeling Capacities for Efficient 3D Human Pose Estimation
Mengmeng Cui
Kunbo Zhang
Zhenan Sun
ViT
70
0
0
03 Jul 2024