Transformers Meet Visual Learning Understanding: A Comprehensive Review

24 March 2022
Yuting Yang
Licheng Jiao
Xuantong Liu
Fan Liu
Shuyuan Yang
Zhixi Feng
Xu Tang
    ViT, MedIm

Papers citing "Transformers Meet Visual Learning Understanding: A Comprehensive Review"

50 / 108 papers shown
Title
TransBTS: Multimodal Brain Tumor Segmentation Using Transformer
Wenxuan Wang
Chen Chen
Meng Ding
Jiangyun Li
Hong Yu
Sen Zha
ViT, MedIm
94
729
0
07 Mar 2021
CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation
Yutong Xie
Jianpeng Zhang
Chunhua Shen
Yong-quan Xia
ViT, MedIm
86
501
0
04 Mar 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
391
1,574
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
535
3,734
0
24 Feb 2021
Conditional Positional Encodings for Vision Transformers
Xiangxiang Chu
Zhi Tian
Bo Zhang
Xinlong Wang
Chunhua Shen
ViT
87
620
0
22 Feb 2021
Medical Transformer: Gated Axial-Attention for Medical Image Segmentation
Jeya Maria Jose Valanarasu
Poojan Oza
Ilker Hacihaliloglu
Vishal M. Patel
ViT, MedIm
108
993
0
21 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
391
2,064
0
09 Feb 2021
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
Jieneng Chen
Yongyi Lu
Qihang Yu
Xiangde Luo
Ehsan Adeli
Yan Wang
Le Lu
Alan Yuille
Yuyin Zhou
ViT, MedIm
98
3,499
0
08 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
266
433
0
01 Feb 2021
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Li-xin Yuan
Yunpeng Chen
Tao Wang
Weihao Yu
Yujun Shi
Zihang Jiang
Francis E. H. Tay
Jiashi Feng
Shuicheng Yan
ViT
146
1,942
0
28 Jan 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Nayeon Lee
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
360
994
0
27 Jan 2021
TrackFormer: Multi-Object Tracking with Transformers
Tim Meinhardt
A. Kirillov
Laura Leal-Taixe
Christoph Feichtenhofer
VOT
276
774
0
07 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
307
2,532
0
04 Jan 2021
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Sixiao Zheng
Jiachen Lu
Hengshuang Zhao
Xiatian Zhu
Zekun Luo
...
Yanwei Fu
Jianfeng Feng
Tao Xiang
Philip Torr
Li Zhang
ViT
194
2,911
0
31 Dec 2020
TransTrack: Multiple Object Tracking with Transformer
Pei Sun
Jinkun Cao
Yi Jiang
Rufeng Zhang
Enze Xie
Zehuan Yuan
Changhu Wang
Ping Luo
ViT, VOT
312
585
0
31 Dec 2020
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
389
6,805
0
23 Dec 2020
FcaNet: Frequency Channel Attention Networks
Zequn Qin
Pengyi Zhang
Leilei Gan
Xi Li
93
711
0
22 Dec 2020
Toward Transformer-Based Object Detection
Josh Beal
Eric Kim
Eric Tzeng
Dong Huk Park
Andrew Zhai
Dmitry Kislyuk
ViT
91
215
0
17 Dec 2020
MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers
Huiyu Wang
Yukun Zhu
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
ViT
128
531
0
01 Dec 2020
Temporal-Channel Transformer for 3D Lidar-Based Video Object Detection in Autonomous Driving
Zhenxun Yuan
Xiao-yang Song
Lei Bai
Wen-gang Zhou
Zhe Wang
Wanli Ouyang
ViT
77
131
0
27 Nov 2020
Rethinking Transformer-based Set Prediction for Object Detection
Zhiqing Sun
Shengcao Cao
Yiming Yang
Kris Kitani
ViT
123
323
0
21 Nov 2020
End-to-End Object Detection with Adaptive Clustering Transformer
Minghang Zheng
Peng Gao
Renrui Zhang
Kunchang Li
Xiaogang Wang
Hongsheng Li
Hao Dong
ViT
156
199
0
18 Nov 2020
RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder
Cheng Chi
Fangyun Wei
Han Hu
ViT
74
64
0
29 Oct 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
676
41,483
0
22 Oct 2020
Deformable DETR: Deformable Transformers for End-to-End Object Detection
Xizhou Zhu
Weijie Su
Lewei Lu
Bin Li
Xiaogang Wang
Jifeng Dai
ViT
246
5,102
0
08 Oct 2020
Efficient Transformers: A Survey
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
166
1,128
0
14 Sep 2020
A Universal Representation Transformer Layer for Few-Shot Image Classification
Lu Liu
William L. Hamilton
Guodong Long
Jing Jiang
Hugo Larochelle
ViT
98
127
0
21 Jun 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT, 3DV, PINN
440
13,130
0
26 May 2020
PolyTransform: Deep Polygon Transformer for Instance Segmentation
Justin Liang
N. Homayounfar
Wei-Chiu Ma
Yuwen Xiong
Rui Hu
R. Urtasun
ViT, ISeg
88
176
0
05 Dec 2019
GhostNet: More Features from Cheap Operations
Kai Han
Yunhe Wang
Qi Tian
Jianyuan Guo
Chunjing Xu
Chang Xu
99
2,690
0
27 Nov 2019
Gated Channel Transformation for Visual Recognition
Zongxin Yang
Linchao Zhu
Yu Wu
Yezhou Yang
ViT
56
209
0
25 Sep 2019
Deep High-Resolution Representation Learning for Visual Recognition
Jingdong Wang
Ke Sun
Tianheng Cheng
Borui Jiang
Chaorui Deng
...
Yadong Mu
Mingkui Tan
Xinggang Wang
Wenyu Liu
Bin Xiao
393
3,630
0
20 Aug 2019
A Survey of Deep Learning-based Object Detection
L. Jiao
Fan Zhang
Fang Liu
Shuyuan Yang
Lingling Li
Zhixi Feng
Rong Qu
ObjD
85
970
0
11 Jul 2019
Stand-Alone Self-Attention in Vision Models
Prajit Ramachandran
Niki Parmar
Ashish Vaswani
Irwan Bello
Anselm Levskaya
Jonathon Shlens
VLM, SLR, ViT
104
1,216
0
13 Jun 2019
CondConv: Conditionally Parameterized Convolutions for Efficient Inference
Brandon Yang
Gabriel Bender
Quoc V. Le
Jiquan Ngiam
MedIm, 3DV
82
637
0
10 Apr 2019
An Attentive Survey of Attention Models
S. Chaudhari
Varun Mithal
Gungor Polatkan
R. Ramanath
149
662
0
05 Apr 2019
SRM : A Style-based Recalibration Module for Convolutional Neural Networks
HyunJae Lee
Hyo-Eun Kim
Hyeonseob Nam
61
225
0
26 Mar 2019
STNReID : Deep Convolutional Networks with Pairwise Spatial Transformer Networks for Partial Person Re-identification
Hao Luo
Xing Fan
Chi Zhang
Wei Jiang
ViT
87
104
0
17 Mar 2019
Selective Kernel Networks
Xiang Li
Wenhai Wang
Xiaolin Hu
Jian Yang
94
2,038
0
15 Mar 2019
Video Action Transformer Network
Rohit Girdhar
João Carreira
Carl Doersch
Andrew Zisserman
ViT
136
709
0
06 Dec 2018
Global Second-order Pooling Convolutional Networks
Zilin Gao
Jiangtao Xie
Qilong Wang
P. Li
74
335
0
29 Nov 2018
Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks
Jie Hu
Li Shen
Samuel Albanie
Gang Sun
Andrea Vedaldi
73
577
0
29 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg
1.8K
95,229
0
11 Oct 2018
A Short Note about Kinetics-600
João Carreira
Eric Noland
Andras Banki-Horvath
Chloe Hillier
Andrew Zisserman
89
528
0
03 Aug 2018
Unified Perceptual Parsing for Scene Understanding
Tete Xiao
Yingcheng Liu
Bolei Zhou
Yuning Jiang
Jian Sun
OCL, VOS
197
1,899
0
26 Jul 2018
Object Detection with Deep Learning: A Review
Zhong-Qiu Zhao
Peng Zheng
Shou-tao Xu
Xindong Wu
ObjD
130
4,017
0
15 Jul 2018
Applications of Deep Learning and Reinforcement Learning to Biological Data
M. S. M. Mahmud
M. S. Kaiser
Amir Hussain
S. Vassanelli
OffRL, AI4CE
67
645
0
10 Nov 2017
Squeeze-and-Excitation Networks
Jie Hu
Li Shen
Samuel Albanie
Gang Sun
Enhua Wu
427
26,568
0
05 Sep 2017
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
Chen Sun
Abhinav Shrivastava
Saurabh Singh
Abhinav Gupta
VLM
207
2,407
0
10 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
792
132,454
0
12 Jun 2017