Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2107.00641
Cited By
Focal Self-attention for Local-Global Interactions in Vision Transformers
1 July 2021
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Focal Self-attention for Local-Global Interactions in Vision Transformers"
50 / 252 papers shown
Title
DMFormer: Closing the Gap Between CNN and Vision Transformers
Zimian Wei
H. Pan
Lujun Li
Menglong Lu
Xin-Yi Niu
Peijie Dong
Dongsheng Li
ViT
156
5
0
16 Sep 2022
CenterFormer: Center-based Transformer for 3D Object Detection
Zixiang Zhou
Xian Zhao
Yu Wang
Panqu Wang
H. Foroosh
3DPC
ViT
114
142
0
12 Sep 2022
MAFormer: A Transformer Network with Multi-scale Attention Fusion for Visual Recognition
Y. Wang
H. Sun
Xiaodi Wang
Bin Zhang
Chaonan Li
Ying Xin
Baochang Zhang
Errui Ding
Shumin Han
ViT
80
15
0
31 Aug 2022
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong
Jianmin Bao
Yinglin Zheng
Ting Zhang
Dongdong Chen
...
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
CLIP
VLM
123
167
0
25 Aug 2022
Efficient Attention-free Video Shift Transformers
Adrian Bulat
Brais Martínez
Georgios Tzimiropoulos
ViT
62
1
0
23 Aug 2022
In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation
Bolin Lai
Miao Liu
Fiona Ryan
James M. Rehg
ViT
99
37
0
08 Aug 2022
TransMatting: Enhancing Transparent Objects Matting with Transformers
Huanqia Cai
Fanglei Xue
Lele Xu
Lili Guo
ViT
72
22
0
05 Aug 2022
TransPillars: Coarse-to-Fine Aggregation for Multi-Frame 3D Object Detection
Zhipeng Luo
Gongjie Zhang
Changqing Zhou
Ti Liu
Shijian Lu
Liang Pan
3DPC
ViT
93
9
0
04 Aug 2022
giMLPs: Gate with Inhibition Mechanism in MLPs
Cheng Kang
Jindich Prokop
Lei Tong
Huiyu Zhou
Yong Hu
Daneil Novak
45
0
0
01 Aug 2022
Global-Local Self-Distillation for Visual Representation Learning
Tim Lebailly
Tinne Tuytelaars
SSL
53
6
0
29 Jul 2022
COVID-19 Detection from Respiratory Sounds with Hierarchical Spectrogram Transformers
Idil Aytekin
Onat Dalmaz
Kaan Gonc
H. Ankishan
E. Saritas
Ulas Bagci
H. Celik
Tolga Çukur
60
12
0
19 Jul 2022
Earthformer: Exploring Space-Time Transformers for Earth System Forecasting
Zhihan Gao
Xingjian Shi
Hao Wang
Yi Zhu
Yuyang Wang
Mu Li
Dit-Yan Yeung
AI4TS
100
159
0
12 Jul 2022
LightViT: Towards Light-Weight Convolution-Free Vision Transformers
Tao Huang
Lang Huang
Shan You
Fei Wang
Chao Qian
Chang Xu
ViT
82
61
0
12 Jul 2022
Compound Prototype Matching for Few-shot Action Recognition
Yifei Huang
Lijin Yang
Yoichi Sato
132
44
0
12 Jul 2022
Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning
Ting Yao
Yingwei Pan
Yehao Li
Chong-Wah Ngo
Tao Mei
ViT
227
142
0
11 Jul 2022
Dual Vision Transformer
Ting Yao
Yehao Li
Yingwei Pan
Yu Wang
Xiaoping Zhang
Tao Mei
ViT
233
82
0
11 Jul 2022
Self-attention on Multi-Shifted Windows for Scene Segmentation
Litao Yu
Zhibin Li
Jian Zhang
Qiang Wu
SSeg
112
1
0
10 Jul 2022
CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers
Runsheng Xu
Zhengzhong Tu
Hao Xiang
Wei Shao
Bolei Zhou
Jiaqi Ma
156
229
0
05 Jul 2022
Improving Semantic Segmentation in Transformers using Hierarchical Inter-Level Attention
Gary Leung
Jun Gao
Fangyin Wei
Sanja Fidler
82
3
0
05 Jul 2022
Polarized Color Image Denoising using Pocoformer
Zhuoxiao Li
Hai-bo Jiang
Yinqiang Zheng
94
3
0
01 Jul 2022
Rethinking Query-Key Pairwise Interactions in Vision Transformers
Cheng-rong Li
Yangxin Liu
79
0
0
01 Jul 2022
Deformable Graph Transformer
Jinyoung Park
Seongjun Yun
Hyeon-ju Park
Jaewoo Kang
Jisu Jeong
KyungHyun Kim
Jung-Woo Ha
Hyunwoo J. Kim
134
7
0
29 Jun 2022
LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs
Yukang Chen
Jianhui Liu
Xinming Zhang
Xiaojuan Qi
Jiaya Jia
131
91
0
21 Jun 2022
Vicinity Vision Transformer
Weixuan Sun
Zhen Qin
Huiyuan Deng
Jianyuan Wang
Yi Zhang
Kaihao Zhang
Nick Barnes
Stan Birchfield
Lingpeng Kong
Yiran Zhong
ViT
73
34
0
21 Jun 2022
Learning Multiscale Transformer Models for Sequence Generation
Bei Li
Tong Zheng
Yi Jing
Chengbo Jiao
Tong Xiao
Jingbo Zhu
80
9
0
19 Jun 2022
Efficient Decoder-free Object Detection with Transformers
Peixian Chen
Mengdan Zhang
Yunhang Shen
Kekai Sheng
Yuting Gao
Xing Sun
Ke Li
Chunhua Shen
ViT
117
17
0
14 Jun 2022
Peripheral Vision Transformer
Juhong Min
Yucheng Zhao
Chong Luo
Minsu Cho
ViT
MDE
82
33
0
14 Jun 2022
Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction
Jun Chen
Ming Hu
Boyang Albert Li
Mohamed Elhoseiny
154
37
0
01 Jun 2022
Self-Supervised Pre-training of Vision Transformers for Dense Prediction Tasks
Jaonary Rabarisoa
Velentin Belissen
Florian Chabot
Q. C. Pham
VLM
ViT
SSL
MDE
45
3
0
30 May 2022
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang
Yunjie Tian
Wei Huang
QiXiang Ye
Qi Dai
Lingxi Xie
Qi Tian
106
29
0
30 May 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh B. Gundavarapu
Alex Lamb
Nan Rosemary Ke
Yoshua Bengio
AI4CE
200
18
0
30 May 2022
Fast Vision Transformers with HiLo Attention
Zizheng Pan
Jianfei Cai
Bohan Zhuang
67
168
0
26 May 2022
Inception Transformer
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
124
201
0
25 May 2022
ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions
Difan Liu
Sandesh Shetty
Tobias Hinz
Matthew Fisher
Richard Y. Zhang
Taesung Park
E. Kalogerakis
ViT
81
32
0
24 May 2022
BolT: Fused Window Transformers for fMRI Time Series Analysis
H. Bedel
Irmak Sivgin
Onat Dalmaz
S. Dar
Tolga Çukur
136
59
0
23 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
185
76
0
20 May 2022
Vision Transformer Adapter for Dense Predictions
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
182
572
0
17 May 2022
MulT: An End-to-End Multitask Learning Transformer
Deblina Bhattacharjee
Tong Zhang
Sabine Süsstrunk
Mathieu Salzmann
ViT
116
68
0
17 May 2022
Transformers in 3D Point Clouds: A Survey
Dening Lu
Qian Xie
Mingqiang Wei
Kyle Gao
Linlin Xu
Jonathan Li
3DPC
ViT
146
53
0
16 May 2022
Transformer Scale Gate for Semantic Segmentation
Hengcan Shi
Munawar Hayat
Jianfei Cai
ViT
95
24
0
14 May 2022
Adaptive Split-Fusion Transformer
Zixuan Su
Hao Zhang
Jingjing Chen
Lei Pang
Chong-Wah Ngo
Yu-Gang Jiang
ViT
103
8
0
26 Apr 2022
OutfitTransformer: Learning Outfit Representations for Fashion Recommendation
Rohan Sarkar
Navaneeth Bodla
Mariya I. Vasileva
Yen-Liang Lin
Anu Beniwal
Alan Lu
Gérard Medioni
66
36
0
11 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
173
255
0
07 Apr 2022
Learning Local and Global Temporal Contexts for Video Semantic Segmentation
Guolei Sun
Yun Liu
Henghui Ding
Min Wu
Luc Van Gool
90
32
0
07 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
91
58
0
06 Apr 2022
MixFormer: Mixing Features across Windows and Dimensions
Qiang Chen
Qiman Wu
Jian Wang
Qinghao Hu
T. Hu
Errui Ding
Jian Cheng
Jingdong Wang
MDE
ViT
88
109
0
06 Apr 2022
MaxViT: Multi-Axis Vision Transformer
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
173
676
0
04 Apr 2022
Dynamic Focus-aware Positional Queries for Semantic Segmentation
Haoyu He
Jianfei Cai
Zizheng Pan
Jing Liu
Jing Zhang
Dacheng Tao
Bohan Zhuang
86
17
0
04 Apr 2022
COOL, a Context Outlooker, and its Application to Question Answering and other Natural Language Processing Tasks
Fangyi Zhu
See-Kiong Ng
S. Bressan
LRM
63
1
0
01 Apr 2022
MatteFormer: Transformer-Based Image Matting via Prior-Tokens
Gyutae Park
S. Son
Jaeyoung Yoo
Seho Kim
Nojun Kwak
ViT
103
66
0
29 Mar 2022
Previous
1
2
3
4
5
6
Next