End-to-End Object Detection with Transformers


26 May 2020
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
    ViT
    3DV
    PINN

Papers citing "End-to-End Object Detection with Transformers"

50 / 5,279 papers shown
Dense Learning based Semi-Supervised Object Detection
Binghui Chen
Pengyu Li
Xiang Chen
Biao Wang
Lei Zhang
Xia Hua
ObjD
42
64
0
15 Apr 2022
MiniViT: Compressing Vision Transformers with Weight Multiplexing
Jinnian Zhang
Houwen Peng
Kan Wu
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
28
124
0
14 Apr 2022
Neighborhood Attention Transformer
Ali Hassani
Steven Walton
Jiacheng Li
Shengjia Li
Humphrey Shi
ViT
AI4TS
36
255
0
14 Apr 2022
DeiT III: Revenge of the ViT
Hugo Touvron
Matthieu Cord
Hervé Jégou
ViT
48
393
0
14 Apr 2022
Residual Swin Transformer Channel Attention Network for Image Demosaicing
W. Xing
K. Egiazarian
ViT
19
14
0
14 Apr 2022
DL4SciVis: A State-of-the-Art Survey on Deep Learning for Scientific Visualization
Chaoli Wang
J. Han
41
36
0
13 Apr 2022
Localization Distillation for Object Detection
Zhaohui Zheng
Rongguang Ye
Ping Wang
Dongwei Ren
Jun Wang
W. Zuo
Ming-Ming Cheng
32
64
0
12 Apr 2022
X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks
Zhaowei Cai
Gukyeong Kwon
Avinash Ravichandran
Erhan Bas
Zhuowen Tu
Rahul Bhotika
Stefano Soatto
ObjD
MLLM
VLM
19
49
0
12 Apr 2022
Towards Open-Set Object Detection and Discovery
Jiyang Zheng
Weihao Li
Jie Hong
L. Petersson
Nick Barnes
ObjD
41
61
0
12 Apr 2022
NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night
XueQing Deng
Peng Wang
Xiaochen Lian
Shawn D. Newsam
41
35
0
12 Apr 2022
HiTPR: Hierarchical Transformer for Place Recognition in Point Cloud
Zhixing Hou
Yan Yan
Chengzhong Xu
Hui Kong
ViT
27
24
0
12 Apr 2022
Glass Segmentation with RGB-Thermal Image Pairs
Dong Huo
Jian Wang
Yiming Qian
Yee-Hong Yang
ISeg
34
40
0
12 Apr 2022
M$^2$BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation
Enze Xie
Zhiding Yu
Daquan Zhou
Jonah Philion
Anima Anandkumar
Sanja Fidler
Ping Luo
J. Álvarez
52
181
0
11 Apr 2022
Category-Aware Transformer Network for Better Human-Object Interaction Detection
Leizhen Dong
Zhimin Li
Kunlun Xu
Zhijun Zhang
Luxin Yan
Sheng Zhong
Xu Zou
ViT
24
30
0
11 Apr 2022
No Token Left Behind: Explainability-Aided Image Classification and Generation
Roni Paiss
Hila Chefer
Lior Wolf
VLM
34
29
0
11 Apr 2022
Consistency Learning via Decoding Path Augmentation for Transformers in Human Object Interaction Detection
Jihwan Park
Seungjun Lee
Hwan Heo
Hyeong Kyu Choi
Hyunwoo J. Kim
19
23
0
11 Apr 2022
OutfitTransformer: Learning Outfit Representations for Fashion Recommendation
Rohan Sarkar
Navaneeth Bodla
Mariya I. Vasileva
Yen-Liang Lin
Anu Beniwal
Alan Lu
Gérard Medioni
27
35
0
11 Apr 2022
Generative Adversarial Networks for Image Augmentation in Agriculture: A Systematic Review
E. Olaniyi
Dong Chen
Yuzhen Lu
Ya-Yu Huang
23
38
0
10 Apr 2022
Linear Complexity Randomized Self-attention Mechanism
Lin Zheng
Chong-Jun Wang
Lingpeng Kong
22
31
0
10 Apr 2022
Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation
Xiangtai Li
Wenwei Zhang
Jiangmiao Pang
Kai-xiang Chen
Guangliang Cheng
Yunhai Tong
Chen Change Loy
VOS
39
87
0
10 Apr 2022
Panoptic-PartFormer: Learning a Unified Model for Panoptic Part Segmentation
Xiangtai Li
Shilin Xu
Yibo Yang
Guangliang Cheng
Yunhai Tong
Dacheng Tao
ViT
19
46
0
10 Apr 2022
Fashionformer: A simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition
Shilin Xu
Xiangtai Li
Jingbo Wang
Guangliang Cheng
Yunhai Tong
Dacheng Tao
ViT
28
27
0
10 Apr 2022
Stripformer: Strip Transformer for Fast Image Deblurring
Fu-Jen Tsai
Yan-Tsung Peng
Yen-Yu Lin
Chung-Chi Tsai
Chia-Wen Lin
ViT
21
173
0
10 Apr 2022
Efficient tracking of team sport players with few game-specific annotations
Adrien Maglo
Astrid Orcesi
Q. C. Pham
29
25
0
08 Apr 2022
Points to Patches: Enabling the Use of Self-Attention for 3D Shape Recognition
Axel Berg
Magnus Oskarsson
Mark O'Connor
3DPC
ViT
29
26
0
08 Apr 2022
Multimodal Multi-Head Convolutional Attention with Various Kernel Sizes for Medical Image Super-Resolution
Mariana-Iuliana Georgescu
Radu Tudor Ionescu
A. Miron
O. Savencu
Nicolae-Cătălin Ristea
N. Verga
Fahad Shahbaz Khan
SupR
26
47
0
08 Apr 2022
Reusing the Task-specific Classifier as a Discriminator: Discriminator-free Adversarial Domain Adaptation
Lin Chen
H. Chen
Zhixiang Wei
Xin Jin
Xiao Tan
Yi Jin
Enhong Chen
33
103
0
08 Apr 2022
Learning Trajectory-Aware Transformer for Video Super-Resolution
Chengxu Liu
Huan Yang
Jianlong Fu
Xueming Qian
ViT
38
82
0
08 Apr 2022
Surface Vision Transformers: Flexible Attention-Based Modelling of Biomedical Surfaces
Simon Dahan
Hao Xu
Logan Z. J. Williams
Abdulah Fawaz
Chunhui Yang
...
A. Edwards
M. Glasser
Alistair Young
Daniel Rueckert
E. C. Robinson
ViT
MedIm
35
0
0
07 Apr 2022
Event Transformer. A sparse-aware solution for efficient event data processing
Alberto Sabater
Luis Montesano
Ana C. Murillo
34
51
0
07 Apr 2022
PSTR: End-to-End One-Step Person Search With Transformers
Jiale Cao
Yanwei Pang
Rao Muhammad Anwer
Hisham Cholakkal
J. Xie
M. Shah
Fahad Shahbaz Khan
ViT
27
50
0
07 Apr 2022
Low-Dose CT Denoising via Sinogram Inner-Structure Transformer
Liutao Yang
Zhongnian Li
Rongjun Ge
Junyong Zhao
Haipeng Si
Daoqiang Zhang
MedIm
31
52
0
07 Apr 2022
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality
Tristan Thrush
Ryan Jiang
Max Bartolo
Amanpreet Singh
Adina Williams
Douwe Kiela
Candace Ross
CoGe
42
404
0
07 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
31
55
0
06 Apr 2022
An Empirical Study of End-to-End Temporal Action Detection
Xiaolong Liu
S. Bai
Xiang Bai
27
58
0
06 Apr 2022
End-to-End Instance Edge Detection
Xueyan Zou
Haotian Liu
Yong Jae Lee
32
2
0
06 Apr 2022
Towards An End-to-End Framework for Flow-Guided Video Inpainting
Zerui Li
Cheng Lu
Jia Qin
Chunle Guo
Ming-Ming Cheng
54
149
0
06 Apr 2022
Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation
Wangbo Zhao
Kai Wang
Xiangxiang Chu
Fuzhao Xue
Xinchao Wang
Yang You
29
21
0
06 Apr 2022
SALISA: Saliency-based Input Sampling for Efficient Video Object Detection
B. Bejnordi
A. Habibian
Fatih Porikli
Amir Ghodrati
52
12
0
05 Apr 2022
Dual-AI: Dual-path Actor Interaction Learning for Group Activity Recognition
Mingfei Han
David Junhao Zhang
Yali Wang
Rui Yan
L. Yao
Xiaojun Chang
Yu Qiao
21
55
0
05 Apr 2022
Detector-Free Weakly Supervised Group Activity Recognition
Dongkeun Kim
Jin S. Lee
Minsu Cho
Suha Kwak
ViT
31
44
0
05 Apr 2022
Text Spotting Transformers
Xiang Zhang
Yongwen Su
Subarna Tripathi
Zhuowen Tu
ViT
34
91
0
05 Apr 2022
MaxViT: Multi-Axis Vision Transformer
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
62
639
0
04 Apr 2022
Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos
Shao-Wei Liu
Subarna Tripathi
Somdeb Majumdar
Xiaolong Wang
EgoV
45
93
0
04 Apr 2022
BatchFormerV2: Exploring Sample Relationships for Dense Representation Learning
Zhi Hou
Baosheng Yu
Chaoyue Wang
Yibing Zhan
Dacheng Tao
ViT
32
11
0
04 Apr 2022
Dynamic Focus-aware Positional Queries for Semantic Segmentation
Haoyu He
Jianfei Cai
Zizheng Pan
Jing Liu
Jing Zhang
Dacheng Tao
Bohan Zhuang
34
17
0
04 Apr 2022
Improving Vision Transformers by Revisiting High-frequency Components
Jiawang Bai
Liuliang Yuan
Shutao Xia
Shuicheng Yan
Zhifeng Li
Wen Liu
ViT
16
90
0
03 Apr 2022
BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation
Zhenyu Li
Xuyang Wang
Xianming Liu
Junjun Jiang
MDE
31
192
0
03 Apr 2022
R(Det)^2: Randomized Decision Routing for Object Detection
Yali Li
Shengjin Wang
ObjD
25
9
0
02 Apr 2022
What to look at and where: Semantic and Spatial Refined Transformer for detecting human-object interactions
A S M Iftekhar
Hao Chen
Kaustav Kundu
Xinyu Li
Joseph Tighe
Davide Modolo
ViT
37
50
0
02 Apr 2022