arXiv:2103.12731
Scaling Local Self-Attention for Parameter Efficient Visual Backbones
23 March 2021
Ashish Vaswani
Prajit Ramachandran
A. Srinivas
Niki Parmar
Blake A. Hechtman
Jonathon Shlens
Papers citing
"Scaling Local Self-Attention for Parameter Efficient Visual Backbones"
50 / 96 papers shown
Title
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
45
0
0
11 Feb 2025
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
152
616
0
31 Dec 2024
SwinSF: Image Reconstruction from Spatial-Temporal Spike Streams
Liangyan Jiang
Chuang Zhu
Yanxu Chen
52
2
0
22 Jul 2024
Improving ensemble extreme precipitation forecasts using generative artificial intelligence
Yingkai Sha
Ryan Sobash
David John Gagne II
33
0
0
05 Jul 2024
Towards Scalable and Versatile Weight Space Learning
Konstantin Schürholt
Michael W. Mahoney
Damian Borth
50
15
0
14 Jun 2024
Wavelet-Based Image Tokenizer for Vision Transformers
Zhenhai Zhu
Radu Soricut
ViT
50
3
0
28 May 2024
Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
Zizhao Hu
Mohammad Rostami
34
0
0
25 May 2024
A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning
Yuelin Zhang
Pengyu Zheng
Wanquan Yan
Chengyu Fang
Shing Shin Cheng
MedIm
37
7
0
05 Mar 2024
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Yuchen Duan
Weiyun Wang
Zhe Chen
Xizhou Zhu
Lewei Lu
Tong Lu
Yu Qiao
Hongsheng Li
Jifeng Dai
Wenhai Wang
ViT
46
44
0
04 Mar 2024
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
47
0
0
01 Dec 2023
CNN Injected Transformer for Image Exposure Correction
Shuning Xu
Xiangyu Chen
Binbin Song
Jiantao Zhou
ViT
19
6
0
08 Sep 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
42
3
0
18 Aug 2023
Distributionally Robust Classification on a Data Budget
Ben Feuer
Ameya Joshi
Minh Pham
C. Hegde
OOD
37
2
0
07 Aug 2023
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
29
12
0
22 May 2023
Inductive biases in deep learning models for weather prediction
Jannik Thümmel
Matthias Karlbauer
S. Otte
C. Zarfl
Georg Martius
...
Thomas Scholten
Ulrich Friedrich
V. Wulfmeyer
B. Goswami
Martin Volker Butz
AI4CE
43
5
0
06 Apr 2023
Pixel Difference Convolutional Network for RGB-D Semantic Segmentation
Jun Yang
Lizhi Bai
Yaoru Sun
Chunqi Tian
Maoyu Mao
Guorun Wang
SSeg
25
16
0
23 Feb 2023
STB-VMM: Swin Transformer Based Video Motion Magnification
Ricard Lado-Roigé
M. A. Pérez
18
13
0
20 Feb 2023
Semantic Image Segmentation: Two Decades of Research
G. Csurka
Riccardo Volpi
Boris Chidlovskii
3DV
35
50
0
13 Feb 2023
Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers
K. Choromanski
Shanda Li
Valerii Likhosherstov
Kumar Avinava Dubey
Shengjie Luo
Di He
Yiming Yang
Tamás Sarlós
Thomas Weingarten
Adrian Weller
37
8
0
03 Feb 2023
Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval
L. Xiao
T. Yamasaki
AI4TS
21
2
0
27 Dec 2022
Lightweight Structure-Aware Attention for Visual Understanding
Heeseung Kwon
F. M. Castro
M. Marín-Jiménez
N. Guil
Alahari Karteek
28
2
0
29 Nov 2022
FsaNet: Frequency Self-attention for Semantic Segmentation
Fengyu Zhang
Ashkan Panahi
Guangjun Gao
AI4TS
32
28
0
28 Nov 2022
Semantic-Aware Local-Global Vision Transformer
Jiatong Zhang
Zengwei Yao
Fanglin Chen
Guangming Lu
Wenjie Pei
ViT
25
0
0
27 Nov 2022
Spatial Mixture-of-Experts
Nikoli Dryden
Torsten Hoefler
MoE
34
9
0
24 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Ming-Ming Cheng
Jiashi Feng
ViT
34
129
0
22 Nov 2022
Vision Transformers in Medical Imaging: A Review
Emerald U. Henry
Onyeka Emebob
C. Omonhinmin
ViT
MedIm
34
34
0
18 Nov 2022
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Wenhai Wang
Jifeng Dai
Zhe Chen
Zhenhang Huang
Zhiqi Li
...
Tong Lu
Lewei Lu
Hongsheng Li
Xiaogang Wang
Yu Qiao
VLM
38
660
0
10 Nov 2022
LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context Propagation in Transformers
Zhuo Huang
Zhiyou Zhao
Banghuai Li
Jungong Han
3DPC
ViT
35
55
0
23 Oct 2022
Transformers Learn Shortcuts to Automata
Bingbin Liu
Jordan T. Ash
Surbhi Goel
A. Krishnamurthy
Cyril Zhang
OffRL
LRM
46
156
0
19 Oct 2022
DCANet: Differential Convolution Attention Network for RGB-D Semantic Segmentation
Lizhi Bai
Jun Yang
Chunqi Tian
Yaoru Sun
Maoyu Mao
Yanjun Xu
Weirong Xu
21
9
0
13 Oct 2022
Centralized Feature Pyramid for Object Detection
Yu Quan
Dong Zhang
Liyan Zhang
Jinhui Tang
ObjD
31
150
0
05 Oct 2022
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Chenglin Yang
Siyuan Qiao
Qihang Yu
Xiaoding Yuan
Yukun Zhu
Alan Yuille
Hartwig Adam
Liang-Chieh Chen
ViT
MoE
39
58
0
04 Oct 2022
Verifiable and Energy Efficient Medical Image Analysis with Quantised Self-attentive Deep Neural Networks
Rakshith Sathish
S. Khare
Debdoot Sheet
39
4
0
30 Sep 2022
Dilated Neighborhood Attention Transformer
Ali Hassani
Humphrey Shi
ViT
MedIm
33
68
0
29 Sep 2022
Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration
Marcos V. Conde
Ui-Jin Choi
Maxime Burchi
Radu Timofte
ViT
59
135
0
22 Sep 2022
Axially Expanded Windows for Local-Global Interaction in Vision Transformers
Zhemin Zhang
Xun Gong
ViT
18
1
0
19 Sep 2022
MRL: Learning to Mix with Attention and Convolutions
Shlok Mohta
Hisahiro Suganuma
Yoshiki Tanaka
28
2
0
30 Aug 2022
DnSwin: Toward Real-World Denoising via Continuous Wavelet Sliding-Transformer
Hao Li
Zhijing Yang
Xiaobin Hong
Ziying Zhao
Junyang Chen
Yukai Shi
Jin-shan Pan
DiffM
ViT
43
11
0
28 Jul 2022
Conditional DETR V2: Efficient Detection Transformer with Box Queries
Xiaokang Chen
Fangyun Wei
Gang Zeng
Jingdong Wang
ViT
30
33
0
18 Jul 2022
Earthformer: Exploring Space-Time Transformers for Earth System Forecasting
Zhihan Gao
Xingjian Shi
Hao Wang
Yi Zhu
Yuyang Wang
Mu Li
Dit-Yan Yeung
AI4TS
39
149
0
12 Jul 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
34
32
0
19 Jun 2022
Recurrent Video Restoration Transformer with Guided Deformable Attention
Christos Sakaridis
Yuchen Fan
Xiaoyu Xiang
Rakesh Ranjan
Eddy Ilg
Simon Green
Jingyun Liang
Kaicheng Zhang
Radu Timofte
Luc Van Gool
42
152
0
05 Jun 2022
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang
Yunjie Tian
Wei Huang
Qixiang Ye
Qi Dai
Lingxi Xie
Qi Tian
64
26
0
30 May 2022
HCFormer: Unified Image Segmentation with Hierarchical Clustering
Teppei Suzuki
27
0
0
20 May 2022
Dense residual Transformer for image denoising
Chao Yao
Shuo Jin
Meiqin Liu
Xiaojuan Ban
ViT
36
29
0
14 May 2022
Activating More Pixels in Image Super-Resolution Transformer
Xiangyu Chen
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
ViT
79
602
0
09 May 2022
Neighborhood Attention Transformer
Ali Hassani
Steven Walton
Jiacheng Li
Shengjia Li
Humphrey Shi
ViT
AI4TS
36
253
0
14 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
51
242
0
07 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
31
55
0
06 Apr 2022
MixFormer: Mixing Features across Windows and Dimensions
Qiang Chen
Qiman Wu
Jian Wang
Qinghao Hu
T. Hu
Errui Ding
Jian Cheng
Jingdong Wang
MDE
ViT
31
103
0
06 Apr 2022