ResearchTrend.AI
Transformer in Transformer


27 February 2021
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
    ViT

Papers citing "Transformer in Transformer"

50 / 553 papers shown
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers
Hyeong Kyu Choi
Joonmyung Choi
Hyunwoo J. Kim
ViT
31
35
0
14 Oct 2022
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets
Zhiying Lu
Hongtao Xie
Chuanbin Liu
Yongdong Zhang
ViT
28
57
0
12 Oct 2022
Coded Residual Transform for Generalizable Deep Metric Learning
Shichao Kan
Yixiong Liang
Min Li
Yigang Cen
Jianxin Wang
Z. He
34
3
0
09 Oct 2022
The Lie Derivative for Measuring Learned Equivariance
Nate Gruver
Marc Finzi
Micah Goldblum
A. Wilson
18
35
0
06 Oct 2022
Towards Flexible Inductive Bias via Progressive Reparameterization Scheduling
Yunsung Lee
Gyuseong Lee
Kwang-seok Ryoo
Hyojun Go
Jihye Park
Seung Wook Kim
32
5
0
04 Oct 2022
Effective Vision Transformer Training: A Data-Centric Perspective
Benjia Zhou
Pichao Wang
Jun Wan
Yan-Ni Liang
Fan Wang
26
5
0
29 Sep 2022
UNesT: Local Spatial Representation Learning with Hierarchical Transformer for Efficient Medical Segmentation
Xin Yu
Qi Yang
Yinchi Zhou
L. Cai
Riqiang Gao
...
R. Abramson
Zizhao Zhang
Yuankai Huo
Bennett A. Landman
Yucheng Tang
ViT
MedIm
42
0
0
28 Sep 2022
Hierarchical MixUp Multi-label Classification with Imbalanced Interdisciplinary Research Proposals
Meng Xiao
Minjie Wu
Ziyue Qiao
Zhiyuan Ning
Yi Du
Yanjie Fu
Yuanchun Zhou
31
2
0
28 Sep 2022
Dense-TNT: Efficient Vehicle Type Classification Neural Network Using Satellite Imagery
Ruikang Luo
Yaofeng Song
Haiying Zhao
Yicheng Zhang
Yi Zhang
Nanbin Zhao
Liping Huang
Rong Su
ViT
16
11
0
27 Sep 2022
Estimating Brain Age with Global and Local Dependencies
Yanwu Yang
Xutao Guo
Zhikai Chang
Chenfei Ye
Yang Xiang
Haiyan Lv
Ting Ma
19
0
0
19 Sep 2022
ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding
Wenjin Wang
Zhengjie Huang
Bin Luo
Qianglong Chen
Qiming Peng
...
Weichong Yin
Shi Feng
Yu Sun
Dianhai Yu
Yin Zhang
ViT
35
11
0
18 Sep 2022
Hierarchical Interdisciplinary Topic Detection Model for Research Proposal Classification
Meng Xiao
Ziyue Qiao
Yanjie Fu
Hao Dong
Yi Du
Pengyang Wang
Hui Xiong
Yuanchun Zhou
34
10
0
16 Sep 2022
SQ-Swin: a Pretrained Siamese Quadratic Swin Transformer for Lettuce Browning Prediction
Dayang Wang
Boce Zhang
Yongshun Xu
Yaguang Luo
Hengyong Yu
ViT
27
1
0
16 Sep 2022
PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers
Zhikai Li
Mengjuan Chen
Junrui Xiao
Qingyi Gu
ViT
MQ
43
33
0
13 Sep 2022
Not All Instances Contribute Equally: Instance-adaptive Class Representation Learning for Few-Shot Visual Recognition
M. Han
Yibing Zhan
Yong Luo
Bo Du
Han Hu
Yonggang Wen
Dacheng Tao
19
6
0
07 Sep 2022
ViTKD: Practical Guidelines for ViT feature knowledge distillation
Zhendong Yang
Zhe Li
Ailing Zeng
Zexian Li
Chun Yuan
Yu Li
88
42
0
06 Sep 2022
gSwin: Gated MLP Vision Model with Hierarchical Structure of Shifted Window
Mocho Go
Hideyuki Tachibana
ViT
37
9
0
24 Aug 2022
FocusFormer: Focusing on What We Need via Architecture Sampler
Jing Liu
Jianfei Cai
Bohan Zhuang
35
7
0
23 Aug 2022
Exploring Adversarial Robustness of Vision Transformers in the Spectral Perspective
Gihyun Kim
Juyeop Kim
Jong-Seok Lee
AAML
ViT
24
4
0
20 Aug 2022
Multiple Instance Neuroimage Transformer
Ayush Singla
Qingyu Zhao
Daniel K. Do
Yuyin Zhou
K. Pohl
Ehsan Adeli
ViT
MedIm
24
11
0
19 Aug 2022
Improved Image Classification with Token Fusion
Keong-Hun Choi
Jin-Woo Kim
Yaolong Wang
J. Ha
ViT
19
0
0
19 Aug 2022
HaloAE: An HaloNet based Local Transformer Auto-Encoder for Anomaly Detection and Localization
É. Mathian
H. Liu
L. Fernandez-Cuesta
Dimitris Samaras
M. Foll
L. Chen
ViT
33
12
0
06 Aug 2022
TransMatting: Enhancing Transparent Objects Matting with Transformers
Huanqia Cai
Fanglei Xue
Lele Xu
Lili Guo
ViT
11
20
0
05 Aug 2022
DropKey
Bonan Li
Yinhan Hu
Xuecheng Nie
Congying Han
Xiangjian Jiang
Tiande Guo
Luoqi Liu
20
11
0
04 Aug 2022
Computer Vision Methods for the Microstructural Analysis of Materials: The State-of-the-art and Future Perspectives
Khaled Alrfou
Amir Kordijazi
Tian Zhao
3DV
42
6
0
29 Jul 2022
Convolutional Embedding Makes Hierarchical Vision Transformer Stronger
Cong Wang
Hongmin Xu
Xiong Zhang
Li Wang
Zhitong Zheng
Haifeng Liu
ViT
20
20
0
27 Jul 2022
Spatiotemporal Self-attention Modeling with Temporal Patch Shift for Action Recognition
Wangmeng Xiang
Chong Li
Biao Wang
Xihan Wei
Xiangpei Hua
Lei Zhang
ViT
30
27
0
27 Jul 2022
TransCL: Transformer Makes Strong and Flexible Compressive Learning
Chong Mou
Jian Zhang
11
24
0
25 Jul 2022
Vision Transformers: From Semantic Segmentation to Dense Prediction
Li Zhang
Jiachen Lu
Sixiao Zheng
Xinxuan Zhao
Xiatian Zhu
Yanwei Fu
Tao Xiang
Jianfeng Feng
Philip H. S. Torr
ViT
27
7
0
19 Jul 2022
Multi-manifold Attention for Vision Transformers
D. Konstantinidis
Ilias Papastratis
K. Dimitropoulos
P. Daras
ViT
27
16
0
18 Jul 2022
Earthformer: Exploring Space-Time Transformers for Earth System Forecasting
Zhihan Gao
Xingjian Shi
Hao Wang
Yi Zhu
Yuyang Wang
Mu Li
Dit-Yan Yeung
AI4TS
42
150
0
12 Jul 2022
Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning
Ting Yao
Yingwei Pan
Yehao Li
Chong-Wah Ngo
Tao Mei
ViT
154
137
0
11 Jul 2022
Dual Vision Transformer
Ting Yao
Yehao Li
Yingwei Pan
Yu Wang
Xiaoping Zhang
Tao Mei
ViT
154
75
0
11 Jul 2022
MaiT: Leverage Attention Masks for More Efficient Image Transformers
Ling Li
Ali Shafiee Ardestani
Joseph Hassoun
14
1
0
06 Jul 2022
Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks
Yongming Rao
Zuyan Liu
Wenliang Zhao
Jie Zhou
Jiwen Lu
ViT
44
36
0
04 Jul 2022
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Zhikai Li
Qingyi Gu
MQ
57
95
0
04 Jul 2022
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
Wenhao Wu
Zhun Sun
Wanli Ouyang
VLM
103
93
0
04 Jul 2022
A Survey on Label-efficient Deep Image Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction
Wei Shen
Zelin Peng
Xuehui Wang
Huayu Wang
Jiazhong Cen
Dongsheng Jiang
Lingxi Xie
Xiaokang Yang
Qi Tian
VLM
19
77
0
04 Jul 2022
Rethinking Query-Key Pairwise Interactions in Vision Transformers
Cheng-rong Li
Yangxin Liu
36
0
0
01 Jul 2022
PVT-COV19D: Pyramid Vision Transformer for COVID-19 Diagnosis
Lilang Zheng
Jiaxuan Fang
Xiaorun Tang
Hanzhang Li
Jiaxin Fan
Tianyi Wang
Rui Zhou
Zhaoyan Yan
ViT
MedIm
31
2
0
30 Jun 2022
Dynamic-Group-Aware Networks for Multi-Agent Trajectory Prediction with Relational Reasoning
Chenxin Xu
Yuxin Wei
Bohan Tang
Sheng Yin
Ya Zhang
Siheng Chen
AI4TS
AI4CE
32
33
0
27 Jun 2022
CV 3315 Is All You Need : Semantic Segmentation Competition
Akide Liu
Zihan Wang
35
4
0
25 Jun 2022
Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space
Jinghuan Shang
Srijan Das
Michael S. Ryoo
44
13
0
23 Jun 2022
Vicinity Vision Transformer
Weixuan Sun
Zhen Qin
Huiyuan Deng
Jianyuan Wang
Yi Zhang
Kaihao Zhang
Nick Barnes
Stan Birchfield
Lingpeng Kong
Yiran Zhong
ViT
42
31
0
21 Jun 2022
Learning Multiscale Transformer Models for Sequence Generation
Bei Li
Tong Zheng
Yi Jing
Chengbo Jiao
Tong Xiao
Jingbo Zhu
32
9
0
19 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
34
32
0
19 Jun 2022
SP-ViT: Learning 2D Spatial Priors for Vision Transformers
Yuxuan Zhou
Wangmeng Xiang
Chong Li
Biao Wang
Xihan Wei
Lei Zhang
M. Keuper
Xia Hua
ViT
37
15
0
15 Jun 2022
Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning
Richard J. Chen
Chengkuan Chen
Yicong Li
Tiffany Y. Chen
A. Trister
Rahul G. Krishnan
Faisal Mahmood
ViT
MedIm
34
407
0
06 Jun 2022
Which models are innately best at uncertainty estimation?
Ido Galil
Mohammed Dabbah
Ran El-Yaniv
UQCV
34
5
0
05 Jun 2022
Federated Adversarial Training with Transformers
Ahmed Aldahdooh
W. Hamidouche
Olivier Déforges
FedML
ViT
25
2
0
05 Jun 2022