ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Transformer in Transformer
arXiv:2103.00112, v3 (latest)


27 February 2021
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
    ViT
ArXiv (abs), PDF, HTML, GitHub (4228★)

Papers citing "Transformer in Transformer"

50 / 558 papers shown
Title
Learning Multiscale Transformer Models for Sequence Generation
Bei Li
Tong Zheng
Yi Jing
Chengbo Jiao
Tong Xiao
Jingbo Zhu
70
9
0
19 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
121
35
0
19 Jun 2022
SP-ViT: Learning 2D Spatial Priors for Vision Transformers
Yuxuan Zhou
Wangmeng Xiang
Chong Li
Biao Wang
Xihan Wei
Lei Zhang
Margret Keuper
Xia Hua
ViT
71
15
0
15 Jun 2022
Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning
Richard J. Chen
Chengkuan Chen
Yicong Li
Tiffany Y. Chen
A. Trister
Rahul G. Krishnan
Faisal Mahmood
ViT, MedIm
140
432
0
06 Jun 2022
Which models are innately best at uncertainty estimation?
Ido Galil
Mohammed Dabbah
Ran El-Yaniv
UQCV
79
5
0
05 Jun 2022
Federated Adversarial Training with Transformers
Ahmed Aldahdooh
W. Hamidouche
Olivier Déforges
FedML, ViT
83
2
0
05 Jun 2022
Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Jun Li
Junyu Chen
Yucheng Tang
Ce Wang
Bennett A. Landman
S. K. Zhou
ViT, OOD, MedIm
175
47
0
02 Jun 2022
Vision GNN: An Image is Worth Graph of Nodes
Kai Han
Yunhe Wang
Jianyuan Guo
Yehui Tang
Enhua Wu
GNN, 3DH
119
377
0
01 Jun 2022
Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets
Leandro M. de Lima
R. Krohling
ViT, MedIm
72
11
0
30 May 2022
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang
Yunjie Tian
Wei Huang
QiXiang Ye
Qi Dai
Lingxi Xie
Qi Tian
106
29
0
30 May 2022
Fast Vision Transformers with HiLo Attention
Zizheng Pan
Jianfei Cai
Bohan Zhuang
67
168
0
26 May 2022
UMSNet: An Universal Multi-sensor Network for Human Activity Recognition
Jialiang Wang
Hao Wei
Yi Wang
Shujia Yang
Chi Li
HAI
53
1
0
24 May 2022
Super Vision Transformer
Mingbao Lin
Mengzhao Chen
Yuxin Zhang
Yunhang Shen
Rongrong Ji
Liujuan Cao
ViT
135
21
0
23 May 2022
FedAdapter: Efficient Federated Learning for Modern NLP
Dongqi Cai
Yaozong Wu
Shangguang Wang
F. Lin
Mengwei Xu
FedML, AI4CE
74
23
0
20 May 2022
TRT-ViT: TensorRT-oriented Vision Transformer
Xin Xia
Jiashi Li
Jie Wu
Xing Wang
Xuefeng Xiao
Min Zheng
Rui Wang
ViT
64
28
0
19 May 2022
Vision Transformer Adapter for Dense Predictions
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
182
572
0
17 May 2022
HoVer-Trans: Anatomy-aware HoVer-Transformer for ROI-free Breast Cancer Diagnosis in Ultrasound Images
Y. Mo
Chu Han
Yu Liu
Min Liu
Zhenwei Shi
...
Zeyan Xu
Xiaomei Huang
Zaiyi Liu
Ying Wang
C. Liang
ViT, MedIm
109
56
0
17 May 2022
ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
Haoran You
Baopu Li
Huihong Shi
Y. Fu
Yingyan Lin
122
17
0
17 May 2022
Row-wise Accelerator for Vision Transformer
Hong-Yi Wang
Tian-Sheuan Chang
68
16
0
09 May 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
Peng Gao
Teli Ma
Hongsheng Li
Ziyi Lin
Jifeng Dai
Yu Qiao
ViT
79
128
0
08 May 2022
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
Junting Pan
Adrian Bulat
Fuwen Tan
Xiatian Zhu
Łukasz Dudziak
Hongsheng Li
Georgios Tzimiropoulos
Brais Martínez
ViT
100
198
0
06 May 2022
Application of belief functions to medical image segmentation: A review
Ling Huang
S. Ruan
Thierry Denoeux
EDL, MedIm
86
31
0
03 May 2022
Visualizing and Explaining Language Models
Adrian M. P. Braşoveanu
Razvan Andonie
MILM, VLM
111
5
0
30 Apr 2022
Generative Adversarial Networks for Image Super-Resolution: A Survey
Chunwei Tian
Xuanyu Zhang
Chun-Wei Lin
W. Zuo
Yanning Zhang
Chia-Wen Lin
GAN
88
45
0
28 Apr 2022
Adaptive Split-Fusion Transformer
Zixuan Su
Hao Zhang
Jingjing Chen
Lei Pang
Chong-Wah Ngo
Yu-Gang Jiang
ViT
96
8
0
26 Apr 2022
Residual Mixture of Experts
Lemeng Wu
Mengchen Liu
Yinpeng Chen
Dongdong Chen
Xiyang Dai
Lu Yuan
MoE
117
37
0
20 Apr 2022
VSA: Learning Varied-Size Window Attention in Vision Transformers
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
100
57
0
18 Apr 2022
VDTR: Video Deblurring with Transformer
Ming Cao
Yanbo Fan
Yong Zhang
Jue Wang
Yujiu Yang
ViT
66
41
0
17 Apr 2022
Safe Self-Refinement for Transformer-based Domain Adaptation
Tao Sun
Cheng Lu
Tianshuo Zhang
Haibin Ling
ViT
65
87
0
16 Apr 2022
MiniViT: Compressing Vision Transformers with Weight Multiplexing
Jinnian Zhang
Houwen Peng
Kan Wu
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
114
127
0
14 Apr 2022
DeiT III: Revenge of the ViT
Hugo Touvron
Matthieu Cord
Hervé Jégou
ViT
129
418
0
14 Apr 2022
Points to Patches: Enabling the Use of Self-Attention for 3D Shape Recognition
Axel Berg
Magnus Oskarsson
Mark O'Connor
3DPC, ViT
82
27
0
08 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
169
255
0
07 Apr 2022
Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results
T. Ridnik
Hussam Lawen
Emanuel Ben-Baruch
Asaf Noy
107
11
0
07 Apr 2022
MixFormer: Mixing Features across Windows and Dimensions
Qiang Chen
Qiman Wu
Jian Wang
Qinghao Hu
T. Hu
Errui Ding
Jian Cheng
Jingdong Wang
MDE, ViT
88
109
0
06 Apr 2022
MaxViT: Multi-Axis Vision Transformer
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
163
676
0
04 Apr 2022
MatteFormer: Transformer-Based Image Matting via Prior-Tokens
Gyutae Park
S. Son
Jaeyoung Yoo
Seho Kim
Nojun Kwak
ViT
89
66
0
29 Mar 2022
Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding
Jiabo Ye
Junfeng Tian
Ming Yan
Xiaoshan Yang
Xuwu Wang
Ji Zhang
Liang He
Xin Lin
ObjD
88
66
0
29 Mar 2022
SepViT: Separable Vision Transformer
Wei Li
Xing Wang
Xin Xia
Jie Wu
Jiashi Li
Xuefeng Xiao
Min Zheng
Shiping Wen
ViT
113
42
0
29 Mar 2022
CD-Net: Histopathology Representation Learning using Pyramidal Context-Detail Network
S. Kapse
Srijan Das
Prateek Prasanna
60
5
0
28 Mar 2022
Brain-inspired Multilayer Perceptron with Spiking Neurons
Wenshuo Li
Hanting Chen
Jianyuan Guo
Ziyang Zhang
Yunhe Wang
75
36
0
28 Mar 2022
Transformers Meet Visual Learning Understanding: A Comprehensive Review
Yuting Yang
Licheng Jiao
Xuantong Liu
Fan Liu
Shuyuan Yang
Zhixi Feng
Xu Tang
ViT, MedIm
120
28
0
24 Mar 2022
Beyond Fixation: Dynamic Window Visual Transformer
Pengzhen Ren
Changlin Li
Guangrun Wang
Yun Xiao
Qing Du
Xiaodan Liang
Xiaojun Chang
ViT
101
36
0
24 Mar 2022
Training-free Transformer Architecture Search
Qinqin Zhou
Kekai Sheng
Xiawu Zheng
Ke Li
Xing Sun
Yonghong Tian
Jie Chen
Rongrong Ji
ViT
85
48
0
23 Mar 2022
PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers
Ryan Grainger
Thomas Paniagua
Xi Song
Naresh P. Cuntoor
Mun Wai Lee
Tianfu Wu
ViT
59
11
0
22 Mar 2022
ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer
Rui Yang
Hailong Ma
Jie Wu
Yansong Tang
Xuefeng Xiao
Min Zheng
Xiu Li
ViT
160
57
0
21 Mar 2022
GroupTransNet: Group Transformer Network for RGB-D Salient Object Detection
Xian Fang
Jin-lei Zhu
Xiuli Shao
Hongpeng Wang
ViT
81
14
0
21 Mar 2022
Vision Transformer with Convolutions Architecture Search
Haichao Zhang
K. Hao
Witold Pedrycz
Lei Gao
Xue-song Tang
Bing Wei
ViT
35
6
0
20 Mar 2022
Three things everyone should know about Vision Transformers
Hugo Touvron
Matthieu Cord
Alaaeldin El-Nouby
Jakob Verbeek
Hervé Jégou
ViT
121
123
0
18 Mar 2022
VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention
Sheng Deng
Zhihao Liang
Lin Sun
Kui Jia
3DPC
58
77
0
18 Mar 2022