ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2107.00641
  4. Cited By
Focal Self-attention for Local-Global Interactions in Vision
  Transformers

Focal Self-attention for Local-Global Interactions in Vision Transformers

1 July 2021
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
    ViT
ArXiv (abs)PDFHTML

Papers citing "Focal Self-attention for Local-Global Interactions in Vision Transformers"

50 / 252 papers shown
Title
Multi-Human Mesh Recovery with Transformers
Multi-Human Mesh Recovery with Transformers
Zeyu Wang
Zhenzhen Weng
Serena Yeung-Levy
3DH
57
1
0
26 Feb 2024
ToDo: Token Downsampling for Efficient Generation of High-Resolution
  Images
ToDo: Token Downsampling for Efficient Generation of High-Resolution Images
Ethan Smith
Nayan Saxena
Aninda Saha
DiffM
51
6
0
21 Feb 2024
A Survey on Transformer Compression
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
158
35
0
05 Feb 2024
Do deep neural networks utilize the weight space efficiently?
Do deep neural networks utilize the weight space efficiently?
Onur Can Koyun
B. U. Toreyin
59
0
0
26 Jan 2024
Vision Mamba: Efficient Visual Representation Learning with
  Bidirectional State Space Model
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Lianghui Zhu
Bencheng Liao
Qian Zhang
Xinlong Wang
Wenyu Liu
Xinggang Wang
Mamba
125
817
0
17 Jan 2024
SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting
SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting
Mingxin Huang
Dezhi Peng
Hongliang Li
Zhenghao Peng
Chongyu Liu
Dahua Lin
Yuliang Liu
Xiang Bai
Lianwen Jin
196
1
0
15 Jan 2024
Self-Attention and Hybrid Features for Replay and Deep-Fake Audio
  Detection
Self-Attention and Hybrid Features for Replay and Deep-Fake Audio Detection
Lian Huang
Chi-Man Pun
66
5
0
11 Jan 2024
Deformable Audio Transformer for Audio Event Detection
Deformable Audio Transformer for Audio Event Detection
Wentao Zhu
81
0
0
24 Dec 2023
Agent Attention: On the Integration of Softmax and Linear Attention
Agent Attention: On the Integration of Softmax and Linear Attention
Dongchen Han
Tianzhu Ye
Yizeng Han
Zhuofan Xia
Siyuan Pan
Pengfei Wan
Shiji Song
Gao Huang
103
88
0
14 Dec 2023
The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel
  Size might be All You Need
The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel Size might be All You Need
Tianjin Huang
Tianlong Chen
Zhangyang Wang
Shiwei Liu
80
1
0
09 Dec 2023
DocBinFormer: A Two-Level Transformer Network for Effective Document
  Image Binarization
DocBinFormer: A Two-Level Transformer Network for Effective Document Image Binarization
Risab Biswas
Swalpa Kumar Roy
Ning Wang
Umapada Pal
Guang-Bin Huang
ViT
29
1
0
06 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
177
0
0
01 Dec 2023
GeoDeformer: Geometric Deformable Transformer for Action Recognition
GeoDeformer: Geometric Deformable Transformer for Action Recognition
Jinhui Ye
Jiaming Zhou
Hui Xiong
Junwei Liang
ViT
51
1
0
29 Nov 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
ViT
93
98
0
28 Nov 2023
Advancing Vision Transformers with Group-Mix Attention
Advancing Vision Transformers with Group-Mix Attention
Chongjian Ge
Xiaohan Ding
Zhan Tong
Lichao Sun
Jiangliu Wang
Yibing Song
Ping Luo
180
18
0
26 Nov 2023
Florence-2: Advancing a Unified Representation for a Variety of Vision
  Tasks
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
141
175
0
10 Nov 2023
Vision Big Bird: Random Sparsification for Full Attention
Vision Big Bird: Random Sparsification for Full Attention
Zhemin Zhang
Xun Gong
ViT
70
1
0
10 Nov 2023
Scattering Vision Transformer: Spectral Mixing Matters
Scattering Vision Transformer: Spectral Mixing Matters
Badri N. Patro
Vijay Srinivas Agneeswaran
121
15
0
02 Nov 2023
VST++: Efficient and Stronger Visual Saliency Transformer
VST++: Efficient and Stronger Visual Saliency Transformer
Nian Liu
Ziyang Luo
Ni Zhang
Junwei Han
ViT
73
20
0
18 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
103
4
0
10 Oct 2023
Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient
  Vision Transformers
Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers
Shiyue Cao
Yueqin Yin
Lianghua Huang
Yu Liu
Xin Zhao
Deli Zhao
Kaiqi Huang
ViT
100
19
0
09 Oct 2023
Single Stage Warped Cloth Learning and Semantic-Contextual Attention
  Feature Fusion for Virtual TryOn
Single Stage Warped Cloth Learning and Semantic-Contextual Attention Feature Fusion for Virtual TryOn
Sanhita Pathak
V. Kaushik
Brejesh Lall
DiffM
85
2
0
08 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
233
3
0
08 Oct 2023
Speech Audio Synthesis from Tagged MRI and Non-Negative Matrix
  Factorization via Plastic Transformer
Speech Audio Synthesis from Tagged MRI and Non-Negative Matrix Factorization via Plastic Transformer
Xiaofeng Liu
Fangxu Xing
Maureen Stone
Jiachen Zhuo
S. Fels
Jerry L. Prince
Xiaofeng Liu
Jonghye Woo
MedIm
64
3
0
26 Sep 2023
CINFormer: Transformer network with multi-stage CNN feature injection
  for surface defect segmentation
CINFormer: Transformer network with multi-stage CNN feature injection for surface defect segmentation
Xiaoheng Jiang
Kaiyi Guo
Yang Lu
Feng Yan
Hao Liu
Jiale Cao
Mingliang Xu
Dacheng Tao
MedImViTUQCV
64
1
0
22 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
RMT: Retentive Networks Meet Vision Transformers
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
170
91
0
20 Sep 2023
Mask-Attention-Free Transformer for 3D Instance Segmentation
Mask-Attention-Free Transformer for 3D Instance Segmentation
Xin Lai
Yuhui Yuan
Ruihang Chu
Yukang Chen
Han Hu
Jiaya Jia
MedImISeg3DPC
102
31
0
04 Sep 2023
RevColV2: Exploring Disentangled Representations in Masked Image
  Modeling
RevColV2: Exploring Disentangled Representations in Masked Image Modeling
Qi Han
Yuxuan Cai
Xiangyu Zhang
123
8
0
02 Sep 2023
Computation-efficient Deep Learning for Computer Vision: A Survey
Computation-efficient Deep Learning for Computer Vision: A Survey
Yulin Wang
Yizeng Han
Chaofei Wang
Shiji Song
Qi Tian
Gao Huang
VLM
141
21
0
27 Aug 2023
Vision Transformer Adapters for Generalizable Multitask Learning
Vision Transformer Adapters for Generalizable Multitask Learning
Deblina Bhattacharjee
Sabine Süsstrunk
Mathieu Salzmann
ViT
91
8
0
23 Aug 2023
SG-Former: Self-guided Transformer with Evolving Token Reallocation
SG-Former: Self-guided Transformer with Evolving Token Reallocation
Sucheng Ren
Xingyi Yang
Songhua Liu
Xinchao Wang
ViT
88
43
0
23 Aug 2023
SPANet: Frequency-balancing Token Mixer using Spectral Pooling
  Aggregation Modulation
SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Dong Hwan Kim
MoE
70
9
0
22 Aug 2023
Revisiting Vision Transformer from the View of Path Ensemble
Revisiting Vision Transformer from the View of Path Ensemble
Shuning Chang
Pichao Wang
Haowen Luo
Fan Wang
Mike Zheng Shou
ViT
66
3
0
12 Aug 2023
DiT: Efficient Vision Transformers with Dynamic Token Routing
DiT: Efficient Vision Transformers with Dynamic Token Routing
Yuchen Ma
Zhengcong Fei
Junshi Huang
ViT
67
2
0
07 Aug 2023
PVG: Progressive Vision Graph for Vision Recognition
PVG: Progressive Vision Graph for Vision Recognition
Jiafu Wu
Jian Li
Jiangning Zhang
Boshen Zhang
M. Chi
Yabiao Wang
Chengjie Wang
ViT
124
15
0
01 Aug 2023
When Multi-Task Learning Meets Partial Supervision: A Computer Vision
  Review
When Multi-Task Learning Meets Partial Supervision: A Computer Vision Review
Maxime Fontana
Michael W. Spratling
Miaojing Shi
89
7
0
25 Jul 2023
Scale-Aware Modulation Meet Transformer
Scale-Aware Modulation Meet Transformer
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoEViT
117
78
0
17 Jul 2023
Adaptive Window Pruning for Efficient Local Motion Deblurring
Adaptive Window Pruning for Efficient Local Motion Deblurring
Haoying Li
Jixin Zhao
Shangchen Zhou
H. Feng
Chongyi Li
Chen Change Loy
ViT
82
5
0
25 Jun 2023
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene
  Understanding
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
Hanrong Ye
Dan Xu
ViT
113
13
0
08 Jun 2023
Collect-and-Distribute Transformer for 3D Point Cloud Analysis
Collect-and-Distribute Transformer for 3D Point Cloud Analysis
Haibo Qiu
Baosheng Yu
Dacheng Tao
3DPCViT
104
7
0
02 Jun 2023
Lightweight Vision Transformer with Bidirectional Interaction
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
170
29
0
01 Jun 2023
Dual Path Transformer with Partition Attention
Dual Path Transformer with Partition Attention
Zhengkai Jiang
Liang Liu
Jiangning Zhang
Yabiao Wang
Mingang Chen
Chengjie Wang
ViT
100
2
0
24 May 2023
OctFormer: Octree-based Transformers for 3D Point Clouds
OctFormer: Octree-based Transformers for 3D Point Clouds
Peng-Shuai Wang
ViT3DPC
83
88
0
04 May 2023
AxWin Transformer: A Context-Aware Vision Transformer Backbone with
  Axial Windows
AxWin Transformer: A Context-Aware Vision Transformer Backbone with Axial Windows
Fangjian Lin
Yizhe Ma
Sitong Wu
Long Yu
Sheng Tian
ViT
44
5
0
02 May 2023
UniNeXt: Exploring A Unified Architecture for Vision Recognition
UniNeXt: Exploring A Unified Architecture for Vision Recognition
Fangjian Lin
Jianlong Yuan
Sitong Wu
Fan Wang
Zhibin Wang
ViT
85
14
0
26 Apr 2023
MixPro: Data Augmentation with MaskMix and Progressive Attention
  Labeling for Vision Transformer
MixPro: Data Augmentation with MaskMix and Progressive Attention Labeling for Vision Transformer
QiHao Zhao
Yangyu Huang
Wei Hu
Fan Zhang
Jing Liu
ViT
75
16
0
24 Apr 2023
MMDR: A Result Feature Fusion Object Detection Approach for Autonomous
  System
MMDR: A Result Feature Fusion Object Detection Approach for Autonomous System
Wendong Zhang
53
0
0
19 Apr 2023
Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene
  Understanding
Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding
Yu-Qi Yang
Yu-Xiao Guo
Jiangfeng Xiong
Yang Liu
Hao Pan
Peng-Shuai Wang
Xin Tong
B. Guo
ViT
108
88
0
14 Apr 2023
SpectFormer: Frequency and Attention is what you need in a Vision
  Transformer
SpectFormer: Frequency and Attention is what you need in a Vision Transformer
Badri N. Patro
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
ViT
94
49
0
13 Apr 2023
RSIR Transformer: Hierarchical Vision Transformer using Random Sampling
  Windows and Important Region Windows
RSIR Transformer: Hierarchical Vision Transformer using Random Sampling Windows and Important Region Windows
Zhemin Zhang
Xun Gong
ViT
44
1
0
13 Apr 2023
Previous
123456
Next