ResearchTrend.AI
End-to-End Object Detection with Transformers

26 May 2020
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
    ViT
    3DV
    PINN

Papers citing "End-to-End Object Detection with Transformers"

50 / 5,293 papers shown
CD-FSOD: A Benchmark for Cross-domain Few-shot Object Detection
Wuti Xiong
75
13
0
11 Oct 2022
Fine-Grained Image Style Transfer with Visual Transformers
Jianbo Wang
Huan Yang
Jianlong Fu
T. Yamasaki
B. Guo
ViT
71
13
0
11 Oct 2022
BoxTeacher: Exploring High-Quality Pseudo Labels for Weakly Supervised Instance Segmentation
Tianheng Cheng
Xinggang Wang
Shaoyu Chen
Qian Zhang
Wenyu Liu
ISeg
40
42
0
11 Oct 2022
EarthNets: Empowering AI in Earth Observation
Zhitong Xiong
Fahong Zhang
Yi Wang
Yilei Shi
Xiao Xiang Zhu
101
74
0
10 Oct 2022
FS-DETR: Few-Shot DEtection TRansformer with prompting and without re-training
Adrian Bulat
Ricardo Guerrero
Brais Martínez
Georgios Tzimiropoulos
47
30
0
10 Oct 2022
4D Unsupervised Object Discovery
Yu-Quan Wang
Yuntao Chen
Zhaoxiang Zhang
3DPC
76
19
0
10 Oct 2022
Don't Copy the Teacher: Data and Model Challenges in Embodied Dialogue
So Yeon Min
Hao Zhu
Ruslan Salakhutdinov
Yonatan Bisk
LM&Ro
66
12
0
10 Oct 2022
Deep Learning for Logo Detection: A Survey
Sujuan Hou
Jiacheng Li
Weiqing Min
Qiang Hou
Yanna Zhao
Yuanjie Zheng
Shuqiang Jiang
51
19
0
10 Oct 2022
DCVQE: A Hierarchical Transformer for Video Quality Assessment
Zu-Hua Li
Lei Yang
ViT
37
2
0
10 Oct 2022
Transformer-based Flood Scene Segmentation for Developing Countries
Ahan M. R.
Roshan Roy
S. Kulkarni
Vaibhav Soni
Ashish Chittora
ViT
8
5
0
09 Oct 2022
Coded Residual Transform for Generalizable Deep Metric Learning
Shichao Kan
Yixiong Liang
Min Li
Yigang Cen
Jianxin Wang
Z. He
43
3
0
09 Oct 2022
Fast-ParC: Capturing Position Aware Global Feature for ConvNets and ViTs
Taojiannan Yang
Haokui Zhang
Wenze Hu
Chen Chen
Xiaoyu Wang
ViT
24
0
0
08 Oct 2022
Non-Monotonic Latent Alignments for CTC-Based Non-Autoregressive Machine Translation
Chenze Shao
Yang Feng
38
20
0
08 Oct 2022
Towards Light Weight Object Detection System
K. Dharma
V. Dayana
Menglan Wu
Venkateswara Rao Cherukuri
Hau Hwang
18
1
0
08 Oct 2022
Humans need not label more humans: Occlusion Copy & Paste for Occluded Human Instance Segmentation
Evan Ling
De-Kai Huang
Minhoe Hur
27
5
0
07 Oct 2022
IronDepth: Iterative Refinement of Single-View Depth using Surface Normal and its Uncertainty
Gwangbin Bae
Ignas Budvytis
R. Cipolla
22
27
0
07 Oct 2022
Time-Space Transformers for Video Panoptic Segmentation
Andra Petrovai
S. Nedevschi
ViT
27
3
0
07 Oct 2022
Trans2k: Unlocking the Power of Deep Models for Transparent Object Tracking
A. Lukežič
Ziga Trojer
Juan E. Sala Matas
Matej Kristan
ViT
34
6
0
07 Oct 2022
EmbryosFormer: Deformable Transformer and Collaborative Encoding-Decoding for Embryos Stage Development Classification
Tien-Phat Nguyen
Trong-Thang Pham
Tri Minh Nguyen
Hung Le
Dung Nguyen
Hau Lam
Phong H. Nguyen
Jennifer Fowler
Minh-Triet Tran
Ngan Le
ViT
35
14
0
07 Oct 2022
Mask3D: Mask Transformer for 3D Semantic Instance Segmentation
Jonas Schult
Francis Engelmann
Alexander Hermans
Or Litany
Siyu Tang
Bastian Leibe
ISeg
50
168
0
06 Oct 2022
Video Referring Expression Comprehension via Transformer with Content-aware Query
Ji Jiang
Meng Cao
Tengtao Song
Yuexian Zou
32
5
0
06 Oct 2022
A Review of Uncertainty Calibration in Pretrained Object Detectors
Denis Huseljic
M. Herde
Mehmet Muejde
Bernhard Sick
UQCV
16
0
0
06 Oct 2022
Focal and Global Spatial-Temporal Transformer for Skeleton-based Action Recognition
Zhimin Gao
Peitao Wang
Pei Lv
Xiaoheng Jiang
Qi-dong Liu
Pichao Wang
Mingliang Xu
Wanqing Li
ViT
57
27
0
06 Oct 2022
Towards Better Semantic Understanding of Mobile Interfaces
Srinivas Sunkara
Maria Wang
Lijuan Liu
Gilles Baechler
Yu-Chung Hsiao
Jindong Chen
Abhanshu Sharma
James Stout
36
23
0
06 Oct 2022
Depth Is All You Need for Monocular 3D Detection
Dennis Park
Jie Li
Di Chen
Vitor Campagnolo Guizilini
Adrien Gaidon
3DPC
MDE
48
7
0
05 Oct 2022
Spatio-Temporal Learnable Proposals for End-to-End Video Object Detection
K. Hashmi
D. Stricker
Muhammad Zeshan Afzal
30
7
0
05 Oct 2022
FQDet: Fast-converging Query-based Detector
Cédric Picron
Punarjay Chakravarty
Tinne Tuytelaars
ObjD
46
2
0
05 Oct 2022
Centralized Feature Pyramid for Object Detection
Yu Quan
Dong Zhang
Liyan Zhang
Jinhui Tang
ObjD
36
156
0
05 Oct 2022
Point Cloud Recognition with Position-to-Structure Attention Transformers
Zhenghu Ding
James Hou
Zhuowen Tu
ViT
3DPC
46
1
0
05 Oct 2022
Multi-view Human Body Mesh Translator
Xiangjian Jiang
Xuecheng Nie
Zitian Wang
Luoqi Liu
Si Liu
33
4
0
04 Oct 2022
A Perceptual Quality Metric for Video Frame Interpolation
Qiqi Hou
Abhijay Ghildyal
Feng Liu
25
20
0
04 Oct 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
46
60
0
04 Oct 2022
Bridged Transformer for Vision and Point Cloud 3D Object Detection
Yikai Wang
Tengqi Ye
Lele Cao
Wen-bing Huang
Gang Hua
Fengxiang He
Dacheng Tao
ViT
57
34
0
04 Oct 2022
Towards Flexible Inductive Bias via Progressive Reparameterization Scheduling
Yunsung Lee
Gyuseong Lee
Kwang-seok Ryoo
Hyojun Go
Jihye Park
Seung Wook Kim
32
5
0
04 Oct 2022
Introducing Vision Transformer for Alzheimer's Disease classification task with 3D input
Zilun Zhang
Farzad Khalvati
MedIm
ViT
22
9
0
03 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng Zhang
Chao Zhang
Hanhua Hu
45
26
0
03 Oct 2022
Attention Distillation: self-supervised vision transformer students need more guidance
Kai Wang
Fei Yang
Joost van de Weijer
ViT
30
16
0
03 Oct 2022
Enhancing Fine-Grained 3D Object Recognition using Hybrid Multi-Modal Vision Transformer-CNN Models
Songsong Xiong
Georgios Tziafas
Hamidreza Kasaei
ViT
31
3
0
03 Oct 2022
Learning Equivariant Segmentation with Instance-Unique Querying
Wenguan Wang
James Liang
Dongfang Liu
ISeg
48
48
0
03 Oct 2022
Fully Transformer Network for Change Detection of Remote Sensing Images
Tianyu Yan
Zifu Wan
Pingping Zhang
ViT
82
54
0
03 Oct 2022
CERBERUS: Simple and Effective All-In-One Automotive Perception Model with Multi Task Learning
Carmelo Scribano
Giorgia Franchini
Ignacio Sañudo Olmedo
Marko Bertogna
32
0
0
03 Oct 2022
Exploiting More Information in Sparse Point Cloud for 3D Single Object Tracking
Yubo Cui
Jiayao Shan
Zuoxu Gu
Zhiheng Li
Zheng Fang
29
24
0
02 Oct 2022
Cascaded Multi-Modal Mixing Transformers for Alzheimer's Disease Classification with Incomplete Data
Linfeng Liu
Siyu Liu
Lu Zhang
X. To
F. Nasrallah
Shekhar S. Chandra
MedIm
41
54
0
01 Oct 2022
F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models
Weicheng Kuo
Huayu Chen
Xiuye Gu
A. Piergiovanni
A. Angelova
MLLM
VLM
ObjD
63
134
0
30 Sep 2022
A Closer Look at Temporal Ordering in the Segmentation of Instructional Videos
Anil Batra
Shreyank N. Gowda
Frank Keller
Laura Sevilla-Lara
49
5
0
30 Sep 2022
Rethinking the Learning Paradigm for Facial Expression Recognition
Weijie Wang
N. Sebe
Bruno Lepri
40
2
0
30 Sep 2022
Transformers for Object Detection in Large Point Clouds
Felicia Ruppel
F. Faion
Claudius Gläser
Klaus C. J. Dietmayer
ViT
34
5
0
30 Sep 2022
Dilated Neighborhood Attention Transformer
Ali Hassani
Humphrey Shi
ViT
MedIm
33
68
0
29 Sep 2022
Spotlight: Mobile UI Understanding using Vision-Language Models with a Focus
Gang Li
Yang Li
32
67
0
29 Sep 2022
Spikformer: When Spiking Neural Network Meets Transformer
Zhaokun Zhou
Yuesheng Zhu
Chao He
Yaowei Wang
Shuicheng Yan
Yonghong Tian
Liuliang Yuan
147
249
0
29 Sep 2022