ResearchTrend.AI
End-to-End Object Detection with Transformers

26 May 2020
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
    ViT
    3DV
    PINN

Papers citing "End-to-End Object Detection with Transformers"

50 / 5,161 papers shown
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Yinghui Li
Li Tao
Dun Liang
Haitao Zheng
85
97
0
07 Nov 2021
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
189
385
0
06 Nov 2021
Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers
Yanhong Zeng
Huan Yang
Hongyang Chao
Jianbo Wang
Jianlong Fu
ViT
27
26
0
05 Nov 2021
Bootstrap Your Object Detector via Mixed Training
Mengde Xu
Zheng-Wei Zhang
Fangyun Wei
Yutong Lin
Yue Cao
Stephen Lin
Han Hu
Xiang Bai
ObjD
19
6
0
04 Nov 2021
STC speaker recognition systems for the NIST SRE 2021
Anastasia Avdeeva
Aleksei Gusev
Igor Korsunov
Alexander Kozlov
G. Lavrentyeva
...
Andrey Shulipa
Alisa Vinogradova
V. Volokhov
Evgeny Smirnov
Vasily Galyuk
11
15
0
03 Nov 2021
Relational Self-Attention: What's Missing in Attention for Video Understanding
Manjin Kim
Heeseung Kwon
Chunyu Wang
Suha Kwak
Minsu Cho
ViT
27
28
0
02 Nov 2021
Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation
Jiaqi Gu
Hyoukjun Kwon
Dilin Wang
Wei Ye
Meng Li
Yu-Hsin Chen
Liangzhen Lai
Vikas Chandra
D. Pan
ViT
24
182
0
01 Nov 2021
Livestock Monitoring with Transformer
Bhavesh Tangirala
Ishan Bhandari
Dániel László
D. K. Gupta
R. Thomas
Devanshu Arya
38
6
0
01 Nov 2021
A Simple Approach to Image Tilt Correction with Self-Attention MobileNet for Smartphones
Siddhant Garg
D. Mohanty
S. Thota
Sukumar Moharana
ViT
11
2
0
31 Oct 2021
Blending Anti-Aliasing into Vision Transformer
Shengju Qian
Hao Shao
Yi Zhu
Mu Li
Jiaya Jia
26
20
0
28 Oct 2021
3D Object Tracking with Transformer
Yubo Cui
Zheng Fang
Jiayao Shan
Zuoxu Gu
Sifan Zhou
ViT
3DPC
13
59
0
28 Oct 2021
HR-RCNN: Hierarchical Relational Reasoning for Object Detection
Hao Chen
Abhinav Shrivastava
17
1
0
26 Oct 2021
Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression
Jiabo He
S. Erfani
Xingjun Ma
James Bailey
Ying Chi
Xiansheng Hua
34
248
0
26 Oct 2021
TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation
Tanzila Rahman
Mengyu Yang
Leonid Sigal
ViT
26
8
0
26 Oct 2021
Instance-Conditional Knowledge Distillation for Object Detection
Zijian Kang
Peizhen Zhang
X. Zhang
Jian-jun Sun
N. Zheng
19
76
0
25 Oct 2021
Exploiting Inter-pixel Correlations in Unsupervised Domain Adaptation for Semantic Segmentation
Inseop Chung
Jayeon Yoo
Nojun Kwak
23
4
0
21 Oct 2021
ESOD:Edge-based Task Scheduling for Object Detection
Yihao Wang
Ling Gao
J. Ren
Rui Cao
Hai Wang
Jie Zheng
Quanli Gao
19
0
0
20 Oct 2021
AFTer-UNet: Axial Fusion Transformer UNet for Medical Image Segmentation
Xiangyi Yan
Hao Tang
Shanlin Sun
Haoyu Ma
Deying Kong
Xiaohui Xie
ViT
MedIm
19
127
0
20 Oct 2021
TransFusion: Cross-view Fusion with Transformer for 3D Human Pose Estimation
Haoyu Ma
Liangjian Chen
Deying Kong
Zhe Wang
Xingwei Liu
Hao Tang
Xiangyi Yan
Yusheng Xie
Shi-yao Lin
Xiaohui Xie
ViT
19
61
0
18 Oct 2021
Unsupervised Finetuning
Suichan Li
Dongdong Chen
Yinpeng Chen
Lu Yuan
Lei Zhang
Qi Chu
B. Liu
Nenghai Yu
30
8
0
18 Oct 2021
BERMo: What can BERT learn from ELMo?
Sangamesh Kodge
Kaushik Roy
28
3
0
18 Oct 2021
HRFormer: High-Resolution Transformer for Dense Prediction
Yuhui Yuan
Rao Fu
Lang Huang
Weihong Lin
Chao Zhang
Xilin Chen
Jingdong Wang
ViT
38
227
0
18 Oct 2021
Finding Strong Gravitational Lenses Through Self-Attention
H. Thuruthipilly
A. Zadrożny
Agnieszka Pollo
Marek Biesiada
16
6
0
18 Oct 2021
Siamese Transformer Pyramid Networks for Real-Time UAV Tracking
Daitao Xing
N. Evangeliou
Athanasios Tsoukalas
Anthony Tzes
ViT
33
54
0
17 Oct 2021
Detecting Gender Bias in Transformer-based Models: A Case Study on BERT
Bingbing Li
Hongwu Peng
Rajat Sainju
Junhuan Yang
Lei Yang
Yueying Liang
Weiwen Jiang
Binghui Wang
Hang Liu
Caiwen Ding
15
11
0
15 Oct 2021
Transformer for Polyp Detection
Shijie Liu
Hongyu Zhou
Xiaozhou Shi
Junwen Pan
ViT
MedIm
32
4
0
14 Oct 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
229
1,019
0
13 Oct 2021
Object DGCNN: 3D Object Detection using Dynamic Graphs
Yue Wang
Justin Solomon
3DPC
157
104
0
13 Oct 2021
Object-Region Video Transformers
Roei Herzig
Elad Ben-Avraham
K. Mangalam
Amir Bar
Gal Chechik
Anna Rohrbach
Trevor Darrell
Amir Globerson
ViT
21
82
0
13 Oct 2021
ByteTrack: Multi-Object Tracking by Associating Every Detection Box
Yifu Zhang
Pei Sun
Yi-Xin Jiang
Dongdong Yu
Fucheng Weng
Zehuan Yuan
Ping Luo
Wenyu Liu
Xinggang Wang
VOT
107
1,330
0
13 Oct 2021
The Dawn of Quantum Natural Language Processing
R. Sipio
Jia-Hong Huang
Samuel Yen-Chi Chen
Stefano Mangini
M. Worring
50
80
0
13 Oct 2021
Revitalizing CNN Attentions via Transformers in Self-Supervised Visual Representation Learning
Chongjian Ge
Youwei Liang
Yibing Song
Jianbo Jiao
Jue Wang
Ping Luo
ViT
21
36
0
11 Oct 2021
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLM
CLIP
65
982
0
09 Oct 2021
Context-LGM: Leveraging Object-Context Relation for Context-Aware Object Recognition
Mingzhou Liu
Xinwei Sun
Fandong Zhang
Yizhou Yu
Yizhou Wang
27
0
0
08 Oct 2021
Trident Pyramid Networks: The importance of processing at the feature pyramid level for better object detection
Cédric Picron
Tinne Tuytelaars
22
4
0
08 Oct 2021
ViDT: An Efficient and Effective Fully Transformer-based Object Detector
Hwanjun Song
Deqing Sun
Sanghyuk Chun
Varun Jampani
Dongyoon Han
Byeongho Heo
Wonjae Kim
Ming-Hsuan Yang
87
76
0
08 Oct 2021
ATISS: Autoregressive Transformers for Indoor Scene Synthesis
Despoina Paschalidou
Amlan Kar
Maria Shugrina
Karsten Kreis
Andreas Geiger
Sanja Fidler
3DV
ViT
33
148
0
07 Oct 2021
MPSN: Motion-aware Pseudo Siamese Network for Indoor Video Head Detection in Buildings
Kailai Sun
Xiaoteng Ma
Peng Liu
Qianchuan Zhao
3DPC
AAML
25
11
0
07 Oct 2021
MetaCOG: A Hierarchical Probabilistic Model for Learning Meta-Cognitive Visual Representations
Marlene D. Berke
Zhangir Azerbayev
M. Belledonne
Zenna Tavares
J. Jara-Ettinger
16
1
0
06 Oct 2021
Dynamically Decoding Source Domain Knowledge for Domain Generalization
Cuicui Kang
Karthik Nandakumar
OOD
ViT
29
1
0
06 Oct 2021
Objects in Semantic Topology
Shuo Yang
Pei Sun
Yi-Xin Jiang
Xiaobo Xia
Ruiheng Zhang
Zehuan Yuan
Changhu Wang
Ping Luo
Min Xu
ObjD
89
29
0
06 Oct 2021
Ripple Attention for Visual Perception with Sub-quadratic Complexity
Lin Zheng
Huijie Pan
Lingpeng Kong
26
3
0
06 Oct 2021
Sound Event Detection Transformer: An Event-based End-to-End Model for Sound Event Detection
Zhi-qin Ye
Xiangdong Wang
Hong Liu
Yueliang Qian
Ruijie Tao
Long Yan
Kazushige Ouchi
ViT
29
15
0
05 Oct 2021
Translating Images into Maps
Avishkar Saha
Oscar Alejandro Mendez Maldonado
Chris Russell
Richard Bowden
ViT
21
144
0
03 Oct 2021
Seeing Glass: Joint Point Cloud and Depth Completion for Transparent Objects
Haoping Xu
Yi Ru Wang
S. Eppel
Alán Aspuru-Guzik
Florian Shkurti
Animesh Garg
99
52
0
30 Sep 2021
PubTables-1M: Towards comprehensive table extraction from unstructured documents
B. Smock
Rohith Pesala
Robin Abraham
LMTD
27
96
0
30 Sep 2021
Semantic Dense Reconstruction with Consistent Scene Segments
Yingcai Wan
Yanyan Li
Yingxuan You
Cheng Guo
Lijin Fang
F. Tombari
3DV
16
1
0
30 Sep 2021
Subdimensional Expansion Using Attention-Based Learning For Multi-Agent Path Finding
Lakshay Virmani
Z. Ren
Sivakumar Rathinam
Howie Choset
26
3
0
29 Sep 2021
CCTrans: Simplifying and Improving Crowd Counting with Transformer
Ye Tian
Xiangxiang Chu
Hongpeng Wang
ViT
21
75
0
29 Sep 2021
UFO-ViT: High Performance Linear Vision Transformer without Softmax
Jeonggeun Song
ViT
114
20
0
29 Sep 2021