2111.15193
Cited By
v1
v2 (latest)
Shunted Self-Attention via Multi-Scale Token Aggregation
30 November 2021
Sucheng Ren
Daquan Zhou
Shengfeng He
Jiashi Feng
Xinchao Wang
ViT
Github (216★)
Papers citing
"Shunted Self-Attention via Multi-Scale Token Aggregation"
50 / 85 papers shown
Title
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
Wenyuan Xu
Shibiao Xu
ViT
525
0
0
06 May 2025
Towards Accurate and Interpretable Neuroblastoma Diagnosis via Contrastive Multi-scale Pathological Image Analysis
Zhu Zhu
Shuo Jiang
Jingyuan Zheng
Yawen Li
Yifei Chen
Manli Zhao
Weizhong Gu
Feiwei Qin
Jinhu Wang
Gang Yu
MedIm
209
0
0
18 Apr 2025
Multi-Modal Brain Tumor Segmentation via 3D Multi-Scale Self-attention and Cross-attention
Yonghao Huang
Leiting Chen
Chuan Zhou
ViT
MedIm
65
0
0
12 Apr 2025
Multi-modal and Multi-view Fundus Image Fusion for Retinopathy Diagnosis via Multi-scale Cross-attention and Shifted Window Self-attention
Yonghao Huang
Leiting Chen
Chuan Zhou
57
0
0
12 Apr 2025
Mixed-granularity Implicit Representation for Continuous Hyperspectral Compressive Reconstruction
Jianan Li
Huan Chen
Wangcai Zhao
Rui Chen
Tingfa Xu
110
0
0
17 Mar 2025
RhythmFormer: Extracting Patterned rPPG Signals based on Periodic Sparse Attention
Bochao Zou
Zizheng Guo
Jiansheng Chen
Junbao Zhuo
Weiran Huang
Huimin Ma
ViT
AI4TS
166
1
0
21 Feb 2025
HResFormer: Hybrid Residual Transformer for Volumetric Medical Image Segmentation
Sucheng Ren
Xiaomeng Li
MedIm
100
3
0
16 Dec 2024
Cascaded Multi-Scale Attention for Enhanced Multi-Scale Feature Extraction and Interaction with Low-Resolution Images
Xiangyong Lu
Masanori Suganuma
Takayuki Okatani
143
0
0
03 Dec 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
111
2
0
12 Nov 2024
DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention
Nguyen Huu Bao Long
Chenyu Zhang
Yuzhi Shi
Tsubasa Hirakawa
Takayoshi Yamashita
Tohgoroh Matsui
H. Fujiyoshi
66
2
0
11 Oct 2024
SparX: A Sparse Cross-Layer Connection Mechanism for Hierarchical Vision Mamba and Transformer Networks
Meng Lou
Yunxiang Fu
Yizhou Yu
Mamba
109
5
0
15 Sep 2024
PSTNet: Enhanced Polyp Segmentation with Multi-scale Alignment and Frequency Domain Integration
Wenhao Xu
Rongtao Xu
Changwei Wang
Xiuli Li
Shibiao Xu
Li Guo
80
5
0
13 Sep 2024
Brain-Inspired Stepwise Patch Merging for Vision Transformers
Yonghao Yu
Dongcheng Zhao
Guobin Shen
Yiting Dong
Yi Zeng
86
0
0
11 Sep 2024
Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation
Hyunwoo Yu
Yubin Cho
Beoungwoo Kang
Seunghun Moon
Kyeongbo Kong
Suk-Ju Kang
78
3
0
24 Jul 2024
MxT: Mamba x Transformer for Image Inpainting
Shuang Chen
Amir Atapour-Abarghouei
Haozheng Zhang
Hubert P. H. Shum
Mamba
62
3
0
23 Jul 2024
UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks
Jingjing Ren
Wenbo Li
Haoyu Chen
Renjing Pei
Bin Shao
Yong Guo
Long Peng
Fenglong Song
Lei Zhu
104
22
0
02 Jul 2024
Rethinking Remote Sensing Change Detection With A Mask View
Xiaowen Ma
Zhenkai Wu
Rongrong Lian
Wei Zhang
Siyang Song
70
3
0
21 Jun 2024
Vision Mamba: Cutting-Edge Classification of Alzheimer's Disease with 3D MRI Scans
Muthukumar K A
Amit Gurung
Priya Ranjan
71
1
0
09 Jun 2024
ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning
Sucheng Ren
Hongru Zhu
Chen Wei
Yijiang Li
Alan Yuille
Cihang Xie
AI4TS
VGen
SSL
78
2
0
24 May 2024
Vision Transformer with Sparse Scan Prior
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
ViT
80
6
0
22 May 2024
Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens
Qihang Fan
Huaibo Huang
Mingrui Chen
Ran He
79
3
0
22 May 2024
Multi-Scale Representations by Varying Window Attention for Semantic Segmentation
Haotian Yan
Ming Wu
Chuang Zhang
100
14
0
25 Apr 2024
HRVDA: High-Resolution Visual Document Assistant
Chaohu Liu
Kun Yin
Haoyu Cao
Xinghua Jiang
Xin Li
Yinsong Liu
Deqiang Jiang
Xing Sun
Linli Xu
VLM
102
26
0
10 Apr 2024
Unsegment Anything by Simulating Deformation
Jiahao Lu
Xingyi Yang
Xinchao Wang
99
4
0
03 Apr 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
119
99
0
26 Mar 2024
FViT: A Focal Vision Transformer with Gabor Filter
Yulong Shi
Mingwei Sun
Yongshuai Wang
Rui Wang
149
4
0
17 Feb 2024
MsSVT++: Mixed-scale Sparse Voxel Transformer with Center Voting for 3D Object Detection
Jianan Li
Shaocong Dong
Lihe Ding
Tingfa Xu
3DPC
75
8
0
22 Jan 2024
MST: Adaptive Multi-Scale Tokens Guided Interactive Segmentation
Long Xu
Shanghong Li
Yongquan Chen
Jun Luo
Shiwu Lai
57
0
0
09 Jan 2024
Factorization Vision Transformer: Modeling Long Range Dependency with Local Window Cost
Haolin Qin
Daquan Zhou
Tingfa Xu
Ziyang Bian
Jianan Li
76
9
0
14 Dec 2023
BACTrack: Building Appearance Collection for Aerial Tracking
Xincong Liu
Tingfa Xu
Ying Wang
Zhinong Yu
Xiaoying Yuan
Haolin Qin
Jianan Li
89
8
0
11 Dec 2023
Advancing Vision Transformers with Group-Mix Attention
Chongjian Ge
Xiaohan Ding
Zhan Tong
Li Yuan
Jiangliu Wang
Yibing Song
Ping Luo
166
18
0
26 Nov 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Chuan Wu
Yizhou Yu
ViT
168
41
0
30 Oct 2023
IFAST: Weakly Supervised Interpretable Face Anti-spoofing from Single-shot Binocular NIR Images
Jiancheng Huang
Donghao Zhou
Shifeng Chen
CVBM
83
2
0
29 Sep 2023
Video Adverse-Weather-Component Suppression Network via Weather Messenger and Adversarial Backpropagation
Yijun Yang
Angelica I. Aviles-Rivero
Huazhu Fu
Ye Liu
Weiming Wang
Lei Zhu
ViT
66
15
0
24 Sep 2023
CINFormer: Transformer network with multi-stage CNN feature injection for surface defect segmentation
Xiaoheng Jiang
Kaiyi Guo
Yang Lu
Feng Yan
Hao Liu
Jiale Cao
Mingliang Xu
Dacheng Tao
MedIm
ViT
UQCV
57
1
0
22 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
162
91
0
20 Sep 2023
Priority-Centric Human Motion Generation in Discrete Latent Space
Hanyang Kong
Kehong Gong
Dongze Lian
Michael Bi Mi
Xinchao Wang
DiffM
119
55
0
28 Aug 2023
Computation-efficient Deep Learning for Computer Vision: A Survey
Yulin Wang
Yizeng Han
Chaofei Wang
Shiji Song
Qi Tian
Gao Huang
VLM
130
20
0
27 Aug 2023
Boosting Semantic Segmentation from the Perspective of Explicit Class Embeddings
Yuhe Liu
Chuanjian Liu
Kai Han
Quan Tang
Zengchang Qin
68
5
0
24 Aug 2023
SG-Former: Self-guided Transformer with Evolving Token Reallocation
Sucheng Ren
Xingyi Yang
Songhua Liu
Xinchao Wang
ViT
80
43
0
23 Aug 2023
Vision Backbone Enhancement via Multi-Stage Cross-Scale Attention
Liang Shang
Yanli Liu
Zhengyang Lou
Shuxue Quan
N. Adluru
Bochen Guan
W. Sethares
99
2
0
10 Aug 2023
MCPA: Multi-scale Cross Perceptron Attention Network for 2D Medical Image Segmentation
Liang Xu
Mingxi Chen
Yiyu Cheng
Pengfei Shao
Shuwei Shen
Peng Yao
Ronald X. Xu
ViT
62
0
0
27 Jul 2023
RetouchingFFHQ: A Large-scale Dataset for Fine-grained Face Retouching Detection
Qichao Ying
Jiaxin Liu
Sheng Li
Haisheng Xu
Zhenxing Qian
Xinpeng Zhang
CVBM
85
8
0
20 Jul 2023
Scale-Aware Modulation Meet Transformer
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
117
78
0
17 Jul 2023
Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work
Qiangchang Wang
Yilong Yin
98
0
0
02 Jun 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
150
29
0
01 Jun 2023
PSLT: A Light-weight Vision Transformer with Ladder Self-Attention and Progressive Shift
Gaojie Wu
Weishi Zheng
Yutong Lu
Q. Tian
ViT
84
15
0
07 Apr 2023
TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration
Kehong Gong
Dongze Lian
Heng Chang
Chuan Guo
Zihang Jiang
Wei Ji
Michael Bi Mi
Xinchao Wang
114
66
0
05 Apr 2023
CNNs with Multi-Level Attention for Domain Generalization
Aristotelis Ballas
Christos Diou
OOD
92
6
0
02 Apr 2023
APPT: Asymmetric Parallel Point Transformer for 3D Point Cloud Understanding
Hengjia Li
Tu Zheng
Zhihao Chi
Zheng Yang
Wenxiao Wang
Boxi Wu
Binbin Lin
Deng Cai
3DPC
76
1
0
31 Mar 2023