2106.07477
Cited By
S$^2$-MLP: Spatial-Shift MLP Architecture for Vision
14 June 2021
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
Papers citing
"S$^2$-MLP: Spatial-Shift MLP Architecture for Vision"
42 / 42 papers shown
Title
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
Wenyuan Xu
Shibiao Xu
ViT
379
0
0
06 May 2025
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
72
0
0
01 Dec 2023
MVT: Multi-view Vision Transformer for 3D Object Recognition
Shuo Chen
Tan Yu
Ping Li
ViT
54
44
0
25 Oct 2021
CycleMLP: A MLP-like Architecture for Dense Prediction
Shoufa Chen
Enze Xie
Chongjian Ge
Runjian Chen
Ding Liang
Ping Luo
115
231
0
21 Jul 2021
AS-MLP: An Axial Shifted MLP Architecture for Vision
Dongze Lian
Zehao Yu
Xing Sun
Shenghua Gao
104
189
0
18 Jul 2021
Global Filter Networks for Image Classification
Yongming Rao
Wenliang Zhao
Zheng Zhu
Jiwen Lu
Jie Zhou
ViT
51
458
0
01 Jul 2021
Rethinking Token-Mixing MLP for MLP-based Vision Backbone
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
56
26
0
28 Jun 2021
Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition
Qibin Hou
Zihang Jiang
Li-xin Yuan
Ming-Ming Cheng
Shuicheng Yan
Jiashi Feng
ViT
MLLM
97
206
0
23 Jun 2021
Container: Context Aggregation Network
Peng Gao
Jiasen Lu
Hongsheng Li
Roozbeh Mottaghi
Aniruddha Kembhavi
ViT
52
71
0
02 Jun 2021
Pay Attention to MLPs
Hanxiao Liu
Zihang Dai
David R. So
Quoc V. Le
AI4CE
84
657
0
17 May 2021
ResMLP: Feedforward networks for image classification with data-efficient training
Hugo Touvron
Piotr Bojanowski
Mathilde Caron
Matthieu Cord
Alaaeldin El-Nouby
...
Gautier Izacard
Armand Joulin
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
VLM
68
657
0
07 May 2021
Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet
Luke Melas-Kyriazi
ViT
25
102
0
06 May 2021
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks
Meng-Hao Guo
Zheng-Ning Liu
Tai-Jiang Mu
Shimin Hu
51
480
0
05 May 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
381
2,638
0
04 May 2021
Twins: Revisiting the Design of Spatial Attention in Vision Transformers
Xiangxiang Chu
Zhi Tian
Yuqing Wang
Bo Zhang
Haibing Ren
Xiaolin K. Wei
Huaxia Xia
Chunhua Shen
ViT
48
1,006
0
28 Apr 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
118
1,248
0
22 Apr 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
119
998
0
31 Mar 2021
Rethinking Spatial Dimensions of Vision Transformers
Byeongho Heo
Sangdoo Yun
Dongyoon Han
Sanghyuk Chun
Junsuk Choe
Seong Joon Oh
ViT
475
573
0
30 Mar 2021
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
114
1,891
0
29 Mar 2021
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding
Pengchuan Zhang
Xiyang Dai
Jianwei Yang
Bin Xiao
Lu Yuan
Lei Zhang
Jianfeng Gao
ViT
59
332
0
29 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
324
21,175
0
25 Mar 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
362
1,544
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
450
3,678
0
24 Feb 2021
Conditional Positional Encodings for Vision Transformers
Xiangxiang Chu
Zhi Tian
Bo Zhang
Xinlong Wang
Chunhua Shen
ViT
62
610
0
22 Feb 2021
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Li-xin Yuan
Yunpeng Chen
Tao Wang
Weihao Yu
Yujun Shi
Zihang Jiang
Francis E. H. Tay
Jiashi Feng
Shuicheng Yan
ViT
99
1,918
0
28 Jan 2021
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
300
6,657
0
23 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
397
40,217
0
22 Oct 2020
Designing Network Design Spaces
Ilija Radosavovic
Raj Prateek Kosaraju
Ross B. Girshick
Kaiming He
Piotr Dollár
GNN
84
1,672
0
30 Mar 2020
4-Connected Shift Residual Networks
Andrew Brown
Pascal Mettes
Marcel Worring
3DPC
42
8
0
22 Oct 2019
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan
Quoc V. Le
3DV
MedIm
109
17,950
0
28 May 2019
TSM: Temporal Shift Module for Efficient Video Understanding
Ji Lin
Chuang Gan
Song Han
78
1,677
0
20 Nov 2018
Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions
Bichen Wu
Alvin Wan
Xiangyu Yue
Peter H. Jin
Sicheng Zhao
Noah Golmant
A. Gholaminejad
Joseph E. Gonzalez
Kurt Keutzer
3DPC
56
363
0
22 Nov 2017
Revisiting Unreasonable Effectiveness of Data in Deep Learning Era
Chen Sun
Abhinav Shrivastava
Saurabh Singh
Abhinav Gupta
VLM
110
2,386
0
10 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
453
129,831
0
12 Jun 2017
Depthwise Separable Convolutions for Neural Machine Translation
Lukasz Kaiser
Aidan Gomez
François Chollet
56
278
0
09 Jun 2017
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
1.1K
20,747
0
17 Apr 2017
Xception: Deep Learning with Depthwise Separable Convolutions
François Chollet
MDE
BDL
PINN
852
14,493
0
07 Oct 2016
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
251
10,412
0
21 Jul 2016
Gaussian Error Linear Units (GELUs)
Dan Hendrycks
Kevin Gimpel
159
4,958
0
27 Jun 2016
FractalNet: Ultra-Deep Neural Networks without Residuals
Gustav Larsson
Michael Maire
Gregory Shakhnarovich
108
937
0
24 May 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
1.4K
192,638
0
10 Dec 2015
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV
BDL
495
27,231
0
02 Dec 2015