End-to-End Object Detection with Transformers


26 May 2020
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
    ViT
    3DV
    PINN

Papers citing "End-to-End Object Detection with Transformers"

50 / 5,167 papers shown
Title
Masked-attention Mask Transformer for Universal Image Segmentation
Bowen Cheng
Ishan Misra
A. Schwing
Alexander Kirillov
Rohit Girdhar
ISeg
96
2,269
0
02 Dec 2021
Self-supervised Video Transformer
Kanchana Ranasinghe
Muzammal Naseer
Salman Khan
F. Khan
Michael S. Ryoo
ViT
33
84
0
02 Dec 2021
SwinTrack: A Simple and Strong Baseline for Transformer Tracking
Liting Lin
Heng Fan
Zhipeng Zhang
Yong-mei Xu
Haibin Ling
ViT
31
302
0
02 Dec 2021
Vision Pair Learning: An Efficient Training Framework for Image Classification
Bei Tong
Xiaoyuan Yu
ViT
17
0
0
02 Dec 2021
Systematic Generalization with Edge Transformers
Leon Bergen
Timothy J. O'Donnell
Dzmitry Bahdanau
10
46
0
01 Dec 2021
Confidence Propagation Cluster: Unleash Full Potential of Object Detectors
Yichun Shen
Wanli Jiang
Zhen Xu
Rundong Li
Junghyun Kwon
Siyi Li
ObjD
22
9
0
01 Dec 2021
Multi-View Stereo with Transformer
Jie Zhu
Bo Peng
Wanqing Li
Haifeng Shen
Zhe Zhang
Jianjun Lei
ViT
29
27
0
01 Dec 2021
CT-block: a novel local and global features extractor for point cloud
Shangwei Guo
Jun Li
Zhengchao Lai
Xiantong Meng
Shaokun Han
ViT
3DPC
21
2
0
30 Nov 2021
SP-SEDT: Self-supervised Pre-training for Sound Event Detection Transformer
Zhi-qin Ye
Xiangdong Wang
Hong Liu
Yueliang Qian
Ruijie Tao
Long Yan
Kazushige Ouchi
ViT
16
2
0
30 Nov 2021
CRIS: CLIP-Driven Referring Image Segmentation
Zhaoqing Wang
Yu Lu
Qiang Li
Xunqiang Tao
Yan Guo
Ming Gong
Tongliang Liu
VLM
40
359
0
30 Nov 2021
MultiPath++: Efficient Information Fusion and Trajectory Aggregation for Behavior Prediction
Balakrishnan Varadarajan
Ahmed S. Hefny
A. Srivastava
Khaled S. Refaat
Nigamaa Nayakanti
...
K. Chen
B. Douillard
C. Lam
Drago Anguelov
Benjamin Sapp
35
306
0
29 Nov 2021
End-to-End Referring Video Object Segmentation with Multimodal Transformers
Adam Botach
Evgenii Zheltonozhskii
Chaim Baskin
VOS
25
140
0
29 Nov 2021
Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling
Xumin Yu
Lulu Tang
Yongming Rao
Tiejun Huang
Jie Zhou
Jiwen Lu
3DPC
42
653
0
29 Nov 2021
TransWeather: Transformer-based Restoration of Images Degraded by Adverse Weather Conditions
Jeya Maria Jose Valanarasu
R. Yasarla
Vishal M. Patel
ViT
48
275
0
29 Nov 2021
UBoCo : Unsupervised Boundary Contrastive Learning for Generic Event Boundary Detection
Hyolim Kang
Jinwoo Kim
Taehyun Kim
Seon Joo Kim
39
25
0
29 Nov 2021
Recurrent Vision Transformer for Solving Visual Reasoning Problems
Nicola Messina
Giuseppe Amato
F. Carrara
Claudio Gennaro
Fabrizio Falchi
ViT
LRM
22
11
0
29 Nov 2021
On the Integration of Self-Attention and Convolution
Xuran Pan
Chunjiang Ge
Rui Lu
S. Song
Guanfu Chen
Zeyi Huang
Gao Huang
SSL
41
287
0
29 Nov 2021
Agent-Centric Relation Graph for Object Visual Navigation
X. Hu
Youfang Lin
Shuo Wang
Zhihao Wu
Kai Lv
36
19
0
29 Nov 2021
Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity
Byungseok Roh
Jaewoong Shin
Wuhyun Shin
Saehoon Kim
ViT
11
142
0
29 Nov 2021
Video Frame Interpolation Transformer
Zhihao Shi
Xiangyu Xu
Xiaohong Liu
Jun Chen
Ming-Hsuan Yang
ViT
17
157
0
27 Nov 2021
GMFlow: Learning Optical Flow via Global Matching
Haofei Xu
Jing Zhang
Jianfei Cai
Hamid Rezatofighi
Dacheng Tao
53
342
0
26 Nov 2021
SWAT: Spatial Structure Within and Among Tokens
Kumara Kahatapitiya
Michael S. Ryoo
25
6
0
26 Nov 2021
Mask Transfiner for High-Quality Instance Segmentation
Lei Ke
Martin Danelljan
Xia Li
Yu-Wing Tai
Chi-Keung Tang
Fisher Yu
ISeg
27
113
0
26 Nov 2021
A Robust Volumetric Transformer for Accurate 3D Tumor Segmentation
Himashi Peiris
Munawar Hayat
Zhaolin Chen
Gary Egan
Mehrtash Harandi
ViT
MedIm
14
123
0
26 Nov 2021
Scene Graph Generation with Geometric Context
Vishal Kumar
Albert Mundu
S. Singh
GNN
3DV
17
2
0
25 Nov 2021
BoxeR: Box-Attention for 2D and 3D Transformers
Duy-Kien Nguyen
Jihong Ju
Olaf Booij
Martin R. Oswald
Cees G. M. Snoek
ViT
28
36
0
25 Nov 2021
PolyViT: Co-training Vision Transformers on Images, Videos and Audio
Valerii Likhosherstov
Anurag Arnab
K. Choromanski
Mario Lucic
Yi Tay
Adrian Weller
Mostafa Dehghani
ViT
35
73
0
25 Nov 2021
Exploiting Both Domain-specific and Invariant Knowledge via a Win-win Transformer for Unsupervised Domain Adaptation
Wen-hui Ma
Jinming Zhang
Shuang Li
Chi Harold Liu
Yulin Wang
Wei Li
ViT
21
11
0
25 Nov 2021
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
Baining Guo
ViT
42
238
0
24 Nov 2021
Self-slimmed Vision Transformer
Zhuofan Zong
Kunchang Li
Guanglu Song
Yali Wang
Yu Qiao
B. Leng
Yu Liu
ViT
21
30
0
24 Nov 2021
Conditional Object-Centric Learning from Video
Thomas Kipf
Gamaleldin F. Elsayed
Aravindh Mahendran
Austin Stone
S. Sabour
G. Heigold
Rico Jonschkowski
Alexey Dosovitskiy
Klaus Greff
OCL
41
214
0
24 Nov 2021
Lepard: Learning partial point cloud matching in rigid and deformable scenes
Yang Li
Tatsuya Harada
3DPC
34
120
0
24 Nov 2021
Sharpness-aware Quantization for Deep Neural Networks
Jing Liu
Jianfei Cai
Bohan Zhuang
MQ
27
24
0
24 Nov 2021
PU-Transformer: Point Cloud Upsampling Transformer
Shi Qiu
Saeed Anwar
Nick Barnes
3DPC
ViT
29
51
0
24 Nov 2021
Multiset-Equivariant Set Prediction with Approximate Implicit Differentiation
Yan Zhang
David W. Zhang
Simon Lacoste-Julien
Gertjan J. Burghouts
Cees G. M. Snoek
BDL
35
21
0
23 Nov 2021
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Faisal Ahmed
Zicheng Liu
Yumao Lu
Lijuan Wang
27
111
0
23 Nov 2021
PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer
Zitong Yu
Yuming Shen
Jingang Shi
Hengshuang Zhao
Philip H. S. Torr
Guoying Zhao
ViT
MedIm
137
167
0
23 Nov 2021
Multi-Person 3D Motion Prediction with Multi-Range Transformers
Jiashun Wang
Huazhe Xu
Medhini Narasimhan
Xiaolong Wang
ViT
40
73
0
23 Nov 2021
Pruning Self-attentions into Convolutional Layers in Single Path
Haoyu He
Jianfei Cai
Jing Liu
Zizheng Pan
Jing Zhang
Dacheng Tao
Bohan Zhuang
ViT
31
40
0
23 Nov 2021
Efficient Video Transformers with Spatial-Temporal Token Selection
Junke Wang
Xitong Yang
Hengduo Li
Li Liu
Zuxuan Wu
Yu-Gang Jiang
ViT
21
63
0
23 Nov 2021
Towards Tokenized Human Dynamics Representation
Kenneth Li
Xiao Sun
Zhirong Wu
Fangyun Wei
Stephen Lin
16
2
0
22 Nov 2021
Class-agnostic Object Detection with Multi-modal Transformer
Muhammad Maaz
H. Rasheed
Salman Khan
F. Khan
Rao Muhammad Anwer
Ming Yang
15
91
0
22 Nov 2021
Mesa: A Memory-saving Training Framework for Transformers
Zizheng Pan
Peng Chen
Haoyu He
Jing Liu
Jianfei Cai
Bohan Zhuang
23
20
0
22 Nov 2021
Efficient Softmax Approximation for Deep Neural Networks with Attention Mechanism
Ihor Vasyltsov
Wooseok Chang
25
12
0
21 Nov 2021
FBNetV5: Neural Architecture Search for Multiple Tasks in One Run
Bichen Wu
Chaojian Li
Hang Zhang
Xiaoliang Dai
Peizhao Zhang
Matthew Yu
Jialiang Wang
Yingyan Lin
Peter Vajda
ViT
27
23
0
19 Nov 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
ViT
52
1,747
0
18 Nov 2021
Restormer: Efficient Transformer for High-Resolution Image Restoration
Syed Waqas Zamir
Aditya Arora
Salman Khan
Munawar Hayat
F. Khan
Ming-Hsuan Yang
ViT
49
2,127
0
18 Nov 2021
Achieving Human Parity on Visual Question Answering
Ming Yan
Haiyang Xu
Chenliang Li
Junfeng Tian
Bin Bi
...
Ji Zhang
Songfang Huang
Fei Huang
Luo Si
Rong Jin
24
12
0
17 Nov 2021
TRIG: Transformer-Based Text Recognizer with Initial Embedding Guidance
Yuefeng Tao
Zhiwei Jia
Runze Ma
Shugong Xu
ViT
19
6
0
16 Nov 2021
Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning
Yizhen Zhang
Minkyu Choi
Kuan Han
Zhongming Liu
VLM
15
15
0
13 Nov 2021