2107.00641
Cited By
Focal Self-attention for Local-Global Interactions in Vision Transformers
1 July 2021
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
Papers citing
"Focal Self-attention for Local-Global Interactions in Vision Transformers"
50 / 252 papers shown
Title
AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer
Jiquan Shan
Junxiao Wang
Lifeng Zhao
Liang Cai
Hongyuan Zhang
Ioannis Liritzis
ViT
247
0
0
22 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
Wenyuan Xu
Shibiao Xu
ViT
527
0
0
06 May 2025
Crafting Query-Aware Selective Attention for Single Image Super-Resolution
Junyoung Kim
Youngrok Kim
Siyeol Jung
Donghyun Min
91
0
0
09 Apr 2025
DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation
Bo Yin
Jiao-Long Cao
Ming-Ming Cheng
Qibin Hou
3DPC
MDE
95
0
0
07 Apr 2025
Atlas: Multi-Scale Attention Improves Long Context Image Modeling
Kumar Krishna Agrawal
Long Lian
Lu Liu
Natalia Harguindeguy
Boyi Li
Alexander Bick
Maggie Chung
Trevor Darrell
Adam Yala
ViT
89
0
0
16 Mar 2025
DCAT: Dual Cross-Attention Fusion for Disease Classification in Radiological Images with Uncertainty Estimation
Jutika Borah
H. Singh
MedIm
170
0
0
14 Mar 2025
MaskAttn-UNet: A Mask Attention-Driven Framework for Universal Low-Resolution Image Segmentation
Anzhe Cheng
Chenzhong Yin
Yu Chang
Heng Ping
Shixuan Li
Shahin Nazarian
Paul Bogdan
SSeg
288
0
0
11 Mar 2025
STARFormer: A Novel Spatio-Temporal Aggregation Reorganization Transformer of FMRI for Brain Disorder Diagnosis
Wenhao Dong
Yuchen Li
Weiming Zeng
Lei Chen
Hongjie Yan
W. Siok
Nizhuan Wang
71
1
0
03 Jan 2025
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
324
734
0
31 Dec 2024
Real Classification by Description: Extending CLIP's Limits of Part Attributes Recognition
Ethan Baron
Idan Tankel
Peter Tu
Guy Ben-Yosef
VLM
142
0
0
18 Dec 2024
Bridging the Divide: Reconsidering Softmax and Linear Attention
Dongchen Han
Yifan Pu
Zhuofan Xia
Yizeng Han
Xuran Pan
Xiu Li
Jiwen Lu
Shiji Song
Gao Huang
136
12
0
09 Dec 2024
Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training
Man Yao
Xuerui Qiu
Tianxiang Hu
J. Hu
Yuhong Chou
Keyu Tian
Jianxing Liao
Luziwei Leng
Bo Xu
Guoqi Li
152
16
0
25 Nov 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
113
2
0
12 Nov 2024
Event-guided Low-light Video Semantic Segmentation
Zhen Yao
Mooi Choo Choo Chuah
89
6
0
01 Nov 2024
COSNet: A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes
Muhammad Ali
Mamoona Javaid
Mubashir Noman
Mustansar Fiaz
Salman Khan
87
0
0
31 Oct 2024
PixelGaussian: Generalizable 3D Gaussian Reconstruction from Arbitrary Views
Xin Fei
Wenzhao Zheng
Yueqi Duan
Weidong Zhan
Masayoshi Tomizuka
Kurt Keutzer
Jiwen Lu
3DGS
79
6
0
24 Oct 2024
MoH: Multi-Head Attention as Mixture-of-Head Attention
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
105
19
0
15 Oct 2024
DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention
Nguyen Huu Bao Long
Chenyu Zhang
Yuzhi Shi
Tsubasa Hirakawa
Takayoshi Yamashita
Tohgoroh Matsui
H. Fujiyoshi
68
2
0
11 Oct 2024
Improving Image Clustering with Artifacts Attenuation via Inference-Time Attention Engineering
Kazumoto Nakamura
Yuji Nozawa
Yu-Chieh Lin
K. Nakata
Youyang Ng
ViT
71
2
0
07 Oct 2024
CBAM-SwinT-BL: Small Rail Surface Defect Detection Method Based on Swin Transformer with Block Level CBAM Enhancement
Jiayi Zhao
Alison Wun-lam Yeung
Ali Muhammad
Songjiang Lai
Vincent To-Yee NG
50
3
0
30 Sep 2024
Insight Any Instance: Promptable Instance Segmentation for Remote Sensing Images
Xuexue Li
VLM
ISeg
94
0
0
11 Sep 2024
A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships
Gracile Astlin Pereira
Muhammad Hussain
ViT
70
10
0
27 Aug 2024
Efficient Visual Representation Learning with Heat Conduction Equation
Zhemin Zhang
Xun Gong
DiffM
3DV
91
0
0
12 Aug 2024
Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation
Hyunwoo Yu
Yubin Cho
Beoungwoo Kang
Seunghun Moon
Kyeongbo Kong
Suk-Ju Kang
84
3
0
24 Jul 2024
SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds
Yanbo Wang
Wentao Zhao
Chuan Cao
Tianchen Deng
Jingchuan Wang
Weidong Chen
3DPC
95
7
0
16 Jul 2024
iiANET: Inception Inspired Attention Hybrid Network for efficient Long-Range Dependency
Haruna Yunusa
Qin Shiyin
Abdulrahman Hamman Adama Chukkol
Isah Bello
A. Lawan
108
4
0
10 Jul 2024
Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes
Qi Ma
Danda Pani Paudel
E. Konukoglu
Luc Van Gool
108
6
0
25 Jun 2024
Fusion of regional and sparse attention in Vision Transformers
Nabil Ibtehaz
Ning Yan
Masood S. Mortazavi
Daisuke Kihara
ViT
62
1
0
13 Jun 2024
You Only Need Less Attention at Each Stage in Vision Transformers
Shuoxi Zhang
Hanpeng Liu
Stephen Lin
Kun He
88
5
0
01 Jun 2024
FocSAM: Delving Deeply into Focused Objects in Segmenting Anything
You Huang
Zongyu Lan
Liujuan Cao
Xianming Lin
Shengchuan Zhang
Guannan Jiang
Rongrong Ji
VLM
61
2
0
29 May 2024
ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
Bencheng Liao
Xinggang Wang
Lianghui Zhu
Qian Zhang
Chang Huang
121
4
0
28 May 2024
Demystify Mamba in Vision: A Linear Attention Perspective
Dongchen Han
Ziyi Wang
Zhuofan Xia
Yizeng Han
Yifan Pu
Chunjiang Ge
Jun Song
Shiji Song
Bo Zheng
Gao Huang
Mamba
132
62
0
26 May 2024
Building Vision Models upon Heat Conduction
Zhaozhi Wang
Yue Liu
Yunfan Liu
Hongtian Yu
Yaowei Wang
Qixiang Ye
ViT
VLM
104
0
0
26 May 2024
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
115
3
0
22 May 2024
Vision Transformer with Sparse Scan Prior
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
ViT
88
6
0
22 May 2024
Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
96
3
0
22 May 2024
Towards Gradient-based Time-Series Explanations through a SpatioTemporal Attention Network
Min Hun Lee
AI4TS
ViT
FAtt
59
3
0
18 May 2024
Sparse Reconstruction of Optical Doppler Tomography with Alternative State Space Model and Attention
Zhenghong Li
Jiaxiang Ren
Wensheng Cheng
C. Du
Yingtian Pan
Haibin Ling
68
0
0
26 Apr 2024
Multi-Scale Representations by Varying Window Attention for Semantic Segmentation
Haotian Yan
Ming Wu
Chuang Zhang
107
14
0
25 Apr 2024
Modeling Multi-Granularity Context Information Flow for Pavement Crack Detection
Junbiao Pang
Baocheng Xiong
Jiaqi Wu
51
0
0
19 Apr 2024
Revisiting Noise Resilience Strategies in Gesture Recognition: Short-Term Enhancement in Surface Electromyographic Signal Analysis
Weiyu Guo
Ziyue Qiao
Ying Sun
Hui Xiong
46
1
0
17 Apr 2024
LUCF-Net: Lightweight U-shaped Cascade Fusion Network for Medical Image Segmentation
Songkai Sun
Qingshan She
Yuliang Ma
Rihui Li
Yingchun Zhang
MedIm
66
1
0
11 Apr 2024
Learning Correlation Structures for Vision Transformers
Manjin Kim
Paul Hongsuck Seo
Cordelia Schmid
Minsu Cho
ViT
102
11
0
05 Apr 2024
SpiralMLP: A Lightweight Vision MLP Architecture
Haojie Mu
Burhan Ul Tayyab
Nicholas Chua
90
0
0
31 Mar 2024
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
Donghyun Kim
Byeongho Heo
Dongyoon Han
90
17
0
28 Mar 2024
Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and Time-Series Analysis
Badri N. Patro
Suhas Ranganath
Vinay P. Namboodiri
Vijay Srinivas Agneeswaran
99
3
0
26 Mar 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
126
99
0
26 Mar 2024
ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions
Chunlong Xia
Xinliang Wang
Feng Lv
Xin Hao
Yifeng Shi
ViT
101
57
0
12 Mar 2024
ACC-ViT: Atrous Convolution's Comeback in Vision Transformers
Nabil Ibtehaz
Ning Yan
Masood S. Mortazavi
Daisuke Kihara
ViT
97
3
0
07 Mar 2024
Interactive Multi-Head Self-Attention with Linear Complexity
Hankyul Kang
Ming-Hsuan Yang
Jongbin Ryu
55
1
0
27 Feb 2024