arXiv: 2112.12786
ELSA: Enhanced Local Self-Attention for Vision Transformer
23 December 2021
Jingkai Zhou
Pichao Wang
Fan Wang
Qiong Liu
Hao Li
Rong Jin
ViT
Papers citing
"ELSA: Enhanced Local Self-Attention for Vision Transformer"
30 / 30 papers shown
Title
SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels
Henry Hengyuan Zhao
Pichao Wang
Yuyang Zhao
Hao Luo
F. Wang
Mike Zheng Shou
ViT
34
14
0
15 Sep 2023
Frequency Disentangled Features in Neural Image Compression
Ali Zafari
Atefeh Khoshkhahtinat
P. Mehta
Mohammad Saeed Ebrahimi Saadabadi
Mohammad Akyash
Nasser M. Nasrabadi
42
14
0
04 Aug 2023
Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention
Mingyu Ding
Yikang Shen
Lijie Fan
Zhenfang Chen
Z. Chen
Ping Luo
J. Tenenbaum
Chuang Gan
ViT
79
14
0
06 Apr 2023
Making Vision Transformers Efficient from A Token Sparsification View
Shuning Chang
Pichao Wang
Ming Lin
Fan Wang
David Junhao Zhang
Rong Jin
Mike Zheng Shou
ViT
43
24
0
15 Mar 2023
Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm
Hengyuan Zhao
Hao Luo
Yuyang Zhao
Pichao Wang
F. Wang
Mike Zheng Shou
15
5
0
14 Mar 2023
Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers
K. Choromanski
Shanda Li
Valerii Likhosherstov
Kumar Avinava Dubey
Shengjie Luo
Di He
Yiming Yang
Tamás Sarlós
Thomas Weingarten
Adrian Weller
23
8
0
03 Feb 2023
EIT: Enhanced Interactive Transformer
Tong Zheng
Bei Li
Huiwen Bao
Tong Xiao
Jingbo Zhu
24
2
0
20 Dec 2022
ViTPose++: Vision Transformer for Generic Body Pose Estimation
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
ViT
34
40
0
07 Dec 2022
Boosting vision transformers for image retrieval
Chull Hwan Song
Jooyoung Yoon
Shunghyun Choi
Yannis Avrithis
ViT
26
31
0
21 Oct 2022
Effective Vision Transformer Training: A Data-Centric Perspective
Benjia Zhou
Pichao Wang
Jun Wan
Yan-Ni Liang
Fan Wang
26
5
0
29 Sep 2022
Video Mobile-Former: Video Recognition with Efficient Global Spatial-temporal Modeling
Rui Wang
Zuxuan Wu
Dongdong Chen
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Luowei Zhou
Lu Yuan
Yu-Gang Jiang
ViT
35
4
0
25 Aug 2022
Conviformers: Convolutionally guided Vision Transformer
Mohit Vaishnav
Thomas Fel
I. F. Rodriguez
Thomas Serre
ViT
30
1
0
17 Aug 2022
EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications
Muhammad Maaz
Abdelrahman M. Shaker
Hisham Cholakkal
Salman Khan
Syed Waqas Zamir
Rao Muhammad Anwer
F. Khan
ViT
27
184
0
21 Jun 2022
ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
ViT
26
512
0
26 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
30
240
0
07 Apr 2022
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
Sachin Mehta
Mohammad Rastegari
ViT
194
1,212
0
05 Oct 2021
SDTP: Semantic-aware Decoupled Transformer Pyramid for Dense Image Prediction
Zekun Li
Yufan Liu
Bing Li
Weiming Hu
Kebin Wu
Chengwei Peng
ViT
25
22
0
18 Sep 2021
Scaled ReLU Matters for Training Vision Transformers
Pichao Wang
Xue Wang
Haowen Luo
Jingkai Zhou
Zhipeng Zhou
Fan Wang
Hao Li
R. L. Jin
13
41
0
08 Sep 2021
Mobile-Former: Bridging MobileNet and Transformer
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Xiaoyi Dong
Lu Yuan
Zicheng Liu
ViT
177
476
0
12 Aug 2021
PSViT: Better Vision Transformer via Token Pooling and Attention Sharing
Boyu Chen
Peixia Li
Baopu Li
Chuming Li
Lei Bai
Chen Lin
Ming-hui Sun
Junjie Yan
Wanli Ouyang
ViT
65
33
0
07 Aug 2021
CMT: Convolutional Neural Networks Meet Vision Transformers
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Chunjing Xu
Yunhe Wang
Chang Xu
ViT
349
500
0
13 Jul 2021
KVT: k-NN Attention for Boosting Vision Transformers
Pichao Wang
Xue Wang
F. Wang
Ming Lin
Shuning Chang
Hao Li
R. L. Jin
ViT
32
105
0
28 May 2021
Decoupled Dynamic Filter Networks
Jingkai Zhou
Varun Jampani
Zhixiong Pi
Qiong Liu
Ming-Hsuan Yang
43
108
0
29 Apr 2021
Visformer: The Vision-friendly Transformer
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
111
209
0
26 Apr 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
284
1,524
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
277
3,622
0
24 Feb 2021
TransReID: Transformer-based Object Re-Identification
Shuting He
Haowen Luo
Pichao Wang
F. Wang
Hao Li
Wei Jiang
ViT
213
794
0
08 Feb 2021
TransTrack: Multiple Object Tracking with Transformer
Pei Sun
Jinkun Cao
Yi-Xin Jiang
Rufeng Zhang
Enze Xie
Zehuan Yuan
Changhu Wang
Ping Luo
ViT
VOT
243
565
0
31 Dec 2020
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
950
20,561
0
17 Apr 2017
A Non-convex One-Pass Framework for Generalized Factorization Machine and Rank-One Matrix Sensing
Ming Lin
Jieping Ye
37
21
0
21 Aug 2016