Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2103.00112
Cited By
Transformer in Transformer
27 February 2021
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Transformer in Transformer"
50 / 553 papers shown
Title
An Image Patch is a Wave: Phase-Aware Vision MLP
Yehui Tang
Kai Han
Jianyuan Guo
Chang Xu
Yanxi Li
Chao Xu
Yunhe Wang
24
133
0
24 Nov 2021
PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer
Zitong Yu
Yuming Shen
Jingang Shi
Hengshuang Zhao
Philip Torr
Guoying Zhao
ViT
MedIm
140
167
0
23 Nov 2021
MetaFormer Is Actually What You Need for Vision
Weihao Yu
Mi Luo
Pan Zhou
Chenyang Si
Yichen Zhou
Xinchao Wang
Jiashi Feng
Shuicheng Yan
40
874
0
22 Nov 2021
PointMixer: MLP-Mixer for Point Cloud Understanding
Jaesung Choe
Chunghyun Park
François Rameau
Jaesik Park
In So Kweon
3DPC
45
98
0
22 Nov 2021
DuDoTrans: Dual-Domain Transformer Provides More Attention for Sinogram Restoration in Sparse-View CT Reconstruction
Ce Wang
Kun Shang
Haimiao Zhang
Qian Li
Yuan Hui
S. Kevin Zhou
ViT
MedIm
11
28
0
21 Nov 2021
Are Vision Transformers Robust to Patch Perturbations?
Jindong Gu
Volker Tresp
Yao Qin
AAML
ViT
38
60
0
20 Nov 2021
INTERN: A New Learning Paradigm Towards General Vision
Jing Shao
Siyu Chen
Yangguang Li
Kun Wang
Zhen-fei Yin
...
F. Yu
Junjie Yan
Dahua Lin
Xiaogang Wang
Yu Qiao
24
34
0
16 Nov 2021
Attention Mechanisms in Computer Vision: A Survey
Meng-Hao Guo
Tianhan Xu
Jiangjiang Liu
Zheng-Ning Liu
Peng-Tao Jiang
Tai-Jiang Mu
Song-Hai Zhang
Ralph Robert Martin
Ming-Ming Cheng
Shimin Hu
19
1,636
0
15 Nov 2021
Searching for TrioNet: Combining Convolution with Local and Global Self-Attention
Huaijin Pi
Huiyu Wang
Yingwei Li
Zizhang Li
Alan Yuille
ViT
27
3
0
15 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
77
330
0
11 Nov 2021
Sliced Recursive Transformer
Zhiqiang Shen
Zechun Liu
Eric P. Xing
ViT
22
27
0
09 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Hai-Tao Zheng
Li Tao
Dun Liang
Haitao Zheng
85
97
0
07 Nov 2021
Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems
Wenqing Zheng
Qiangqiang Guo
H. Yang
Peihao Wang
Zhangyang Wang
AI4CE
14
12
0
29 Oct 2021
3D Object Tracking with Transformer
Yubo Cui
Zheng Fang
Jiayao Shan
Zuoxu Gu
Sifan Zhou
ViT
3DPC
26
60
0
28 Oct 2021
MVT: Multi-view Vision Transformer for 3D Object Recognition
Shuo Chen
Tan Yu
Ping Li
ViT
37
43
0
25 Oct 2021
HRFormer: High-Resolution Transformer for Dense Prediction
Yuhui Yuan
Rao Fu
Lang Huang
Weihong Lin
Chao Zhang
Xilin Chen
Jingdong Wang
ViT
38
227
0
18 Oct 2021
SpecTNT: a Time-Frequency Transformer for Music Audio
Weiyi Lu
Ju-Chiang Wang
Minz Won
Keunwoo Choi
Xuchen Song
ViT
25
45
0
18 Oct 2021
StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning
Jinghuan Shang
Kumara Kahatapitiya
Xiang Li
Michael S. Ryoo
OffRL
43
33
0
12 Oct 2021
Global Vision Transformer Pruning with Hessian-Aware Saliency
Huanrui Yang
Hongxu Yin
Maying Shen
Pavlo Molchanov
Hai Helen Li
Jan Kautz
ViT
30
39
0
10 Oct 2021
Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs
Philipp Benz
Soomin Ham
Chaoning Zhang
Adil Karjauv
In So Kweon
AAML
ViT
47
79
0
06 Oct 2021
Implicit and Explicit Attention for Zero-Shot Learning
Faisal Alamri
Anjan Dutta
90
7
0
02 Oct 2021
Seeking an Optimal Approach for Computer-Aided Pulmonary Embolism Detection
N. Islam
S. Gehlot
Zongwei Zhou
Michael B. Gotway
Jianming Liang
OOD
88
11
0
15 Sep 2021
CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation
Tongkun Xu
Weihua Chen
Pichao Wang
Fan Wang
Hao Li
Rong Jin
ViT
59
215
0
13 Sep 2021
Towards Transferable Adversarial Attacks on Vision Transformers
Zhipeng Wei
Jingjing Chen
Micah Goldblum
Zuxuan Wu
Tom Goldstein
Yu-Gang Jiang
ViT
AAML
24
112
0
09 Sep 2021
Scaled ReLU Matters for Training Vision Transformers
Pichao Wang
Xue Wang
Haowen Luo
Jingkai Zhou
Zhipeng Zhou
Fan Wang
Hao Li
Rong Jin
19
41
0
08 Sep 2021
Ultra-high Resolution Image Segmentation via Locality-aware Context Fusion and Alternating Local Enhancement
Wenxi Liu
Qi Li
Xin Lin
Weixiang Yang
Shengfeng He
Yuanlong Yu
29
7
0
06 Sep 2021
Searching for Efficient Multi-Stage Vision Transformers
Yi-Lun Liao
S. Karaman
Vivienne Sze
ViT
18
19
0
01 Sep 2021
Hire-MLP: Vision MLP via Hierarchical Rearrangement
Jianyuan Guo
Yehui Tang
Kai Han
Xinghao Chen
Han Wu
Chao Xu
Chang Xu
Yunhe Wang
46
105
0
30 Aug 2021
Exploring and Improving Mobile Level Vision Transformers
Pengguang Chen
Yixin Chen
Shu Liu
Ming Yang
Jiaya Jia
ViT
29
4
0
30 Aug 2021
Boosting Salient Object Detection with Transformer-based Asymmetric Bilateral U-Net
Yu Qiu
Yun-Hai Liu
Le Zhang
Jing Xu
ViT
21
30
0
17 Aug 2021
Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers
B. Dong
Wenhai Wang
Deng-Ping Fan
Jinpeng Li
Huazhu Fu
Ling Shao
ViT
MedIm
31
316
0
16 Aug 2021
RaftMLP: How Much Can Be Done Without Attention and with Less Spatial Locality?
Yuki Tatsunami
Masato Taki
39
12
0
09 Aug 2021
S
2
^2
2
-MLPv2: Improved Spatial-Shift MLP Architecture for Vision
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
39
50
0
02 Aug 2021
Congested Crowd Instance Localization with Dilated Convolutional Swin Transformer
Junyuan Gao
Maoguo Gong
Xuelong Li
ViT
24
46
0
02 Aug 2021
CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention
Wenxiao Wang
Lulian Yao
Long Chen
Binbin Lin
Deng Cai
Xiaofei He
Wei Liu
34
258
0
31 Jul 2021
Multi-Head Self-Attention via Vision Transformer for Zero-Shot Learning
Faisal Alamri
Anjan Dutta
ViT
24
23
0
30 Jul 2021
DPT: Deformable Patch-based Transformer for Visual Recognition
Zhiyang Chen
Yousong Zhu
Chaoyang Zhao
Guosheng Hu
Wei Zeng
Jinqiao Wang
Ming Tang
ViT
16
98
0
30 Jul 2021
Contextual Transformer Networks for Visual Recognition
Yehao Li
Ting Yao
Yingwei Pan
Tao Mei
ViT
22
468
0
26 Jul 2021
AS-MLP: An Axial Shifted MLP Architecture for Vision
Dongze Lian
Zehao Yu
Xing Sun
Shenghua Gao
28
189
0
18 Jul 2021
Visual Parser: Representing Part-whole Hierarchies with Transformers
Shuyang Sun
Xiaoyu Yue
S. Bai
Philip Torr
50
27
0
13 Jul 2021
What Makes for Hierarchical Vision Transformer?
Yuxin Fang
Xinggang Wang
Rui Wu
Wenyu Liu
ViT
18
9
0
05 Jul 2021
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation
Zhiwei Hao
Jianyuan Guo
Ding Jia
Kai Han
Yehui Tang
Chao Zhang
Dacheng Tao
Yunhe Wang
ViT
33
68
0
03 Jul 2021
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
ViT
25
956
0
01 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
42
428
0
01 Jul 2021
Augmented Shortcuts for Vision Transformers
Yehui Tang
Kai Han
Chang Xu
An Xiao
Yiping Deng
Chao Xu
Yunhe Wang
ViT
19
39
0
30 Jun 2021
Rethinking Token-Mixing MLP for MLP-based Vision Backbone
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
45
26
0
28 Jun 2021
Post-Training Quantization for Vision Transformer
Zhenhua Liu
Yunhe Wang
Kai Han
Siwei Ma
Wen Gao
ViT
MQ
56
326
0
27 Jun 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
50
1,615
0
25 Jun 2021
ViTAS: Vision Transformer Architecture Search
Xiu Su
Shan You
Jiyang Xie
Mingkai Zheng
Fei Wang
Chao Qian
Changshui Zhang
Xiaogang Wang
Chang Xu
ViT
27
54
0
25 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
52
314
0
24 Jun 2021
Previous
1
2
3
...
10
11
12
9
Next