Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2104.13840
Cited By
Twins: Revisiting the Design of Spatial Attention in Vision Transformers
28 April 2021
Xiangxiang Chu
Zhi Tian
Yuqing Wang
Bo Zhang
Haibing Ren
Xiaolin K. Wei
Huaxia Xia
Chunhua Shen
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Twins: Revisiting the Design of Spatial Attention in Vision Transformers"
50 / 188 papers shown
Title
Multi-modal wound classification using wound image and location by Xception and Gaussian Mixture Recurrent Neural Network (GMRNN)
Ramin Mousa
Ehsan Matbooe
Hakimeh Khojasteh
Amirali Bengari
Mohammadmahdi Vahediahmar
26
1
0
12 May 2025
Rethinking Boundary Detection in Deep Learning-Based Medical Image Segmentation
Yi-Mou Lin
Dong-Ming Zhang
X. B. Fang
Yufan Chen
K.-T. Cheng
Hao Chen
33
0
0
06 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
W. Xu
Shibiao Xu
ViT
148
0
0
06 May 2025
A Simple DropConnect Approach to Transfer-based Targeted Attack
Tongrui Su
Qingbin Li
Shengyu Zhu
Wei Chen
Xueqi Cheng
AAML
69
0
0
24 Apr 2025
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
82
0
0
03 Apr 2025
HOTFormerLoc: Hierarchical Octree Transformer for Versatile Lidar Place Recognition Across Ground and Aerial Views
Ethan Griffiths
Maryam Haghighat
Simon Denman
Clinton Fookes
Milad Ramezani
3DPC
59
0
0
11 Mar 2025
USP: Unified Self-Supervised Pretraining for Image Generation and Understanding
Xiangxiang Chu
Renda Li
Yong Wang
62
0
0
08 Mar 2025
Machine learning for modelling unstructured grid data in computational physics: a review
Sibo Cheng
Marc Bocquet
Weiping Ding
Tobias S. Finn
Rui Fu
...
Yong Zeng
Mingrui Zhang
Hao Zhou
Kewei Zhu
Rossella Arcucci
PINN
AI4CE
114
0
0
13 Feb 2025
V2X-DGPE: Addressing Domain Gaps and Pose Errors for Robust Collaborative 3D Object Detection
Sichao Wang
Chuang Zhang
Ming Yuan
Qing Xu
Lei He
Jianqiang Wang
49
1
0
28 Jan 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Jinwei Gu
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
152
0
0
21 Jan 2025
Protego: Detecting Adversarial Examples for Vision Transformers via Intrinsic Capabilities
Jialin Wu
Kaikai Pan
Yanjiao Chen
Jiangyi Deng
Shengyuan Pang
Wenyuan Xu
ViT
AAML
43
0
0
13 Jan 2025
Improving Transferable Targeted Attacks with Feature Tuning Mixup
K. Liang
Xuelong Dai
Yanjie Li
Dong Wang
Bin Xiao
AAML
155
0
0
23 Nov 2024
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
110
3
0
22 Nov 2024
S
4
^4
4
ST: A Strong, Self-transferable, faSt, and Simple Scale Transformation for Transferable Targeted Attack
Yongxiang Liu
Bowen Peng
Li Liu
X. Li
113
0
0
13 Oct 2024
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
Jintao Zhang
Jia wei
Pengle Zhang
Jun-Jie Zhu
Jun Zhu
Jianfei Chen
VLM
MQ
82
19
0
03 Oct 2024
TransDAE: Dual Attention Mechanism in a Hierarchical Transformer for Efficient Medical Image Segmentation
Bobby Azad
Pourya Adibfar
Kaiqun Fu
ViT
MedIm
24
0
0
03 Sep 2024
MacFormer: Semantic Segmentation with Fine Object Boundaries
Guoan Xu
Wenfeng Huang
Tao Wu
Ligeng Chen
Wenjing Jia
Guangwei Gao
Xiatian Zhu
Stuart W. Perry
40
0
0
11 Aug 2024
Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets
Tianxiao Zhang
Wenju Xu
Bo Luo
Guanghui Wang
ViT
MDE
40
7
0
28 Jul 2024
SwinSF: Image Reconstruction from Spatial-Temporal Spike Streams
Liangyan Jiang
Chuang Zhu
Yanxu Chen
52
2
0
22 Jul 2024
HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification
Omar S. El-Assiouti
Ghada Hamed
Dina Khattab
H. M. Ebied
37
1
0
10 Jul 2024
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Ali Hatamizadeh
Jan Kautz
Mamba
45
56
0
10 Jul 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
48
6
0
06 Jun 2024
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
52
2
0
22 May 2024
Exploring Frequencies via Feature Mixing and Meta-Learning for Improving Adversarial Transferability
Juanjuan Weng
Zhiming Luo
Shaozi Li
AAML
36
1
0
06 May 2024
Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution
Cansu Korkmaz
A. Murat Tekalp
ViT
44
6
0
17 Apr 2024
SpiralMLP: A Lightweight Vision MLP Architecture
Haojie Mu
Burhan Ul Tayyab
Nicholas Chua
43
0
0
31 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
25
15
0
18 Mar 2024
Learning Correction Errors via Frequency-Self Attention for Blind Image Super-Resolution
Haochen Sun
Yan Yuan
Lijuan Su
Hao-Yu Shao
41
1
0
12 Mar 2024
Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration
Jingyun Xue
Tao Wang
Jun Wang
Kaihao Zhang
ViT
45
2
0
09 Mar 2024
ClassLIE: Structure- and Illumination-Adaptive Classification for Low-Light Image Enhancement
Zixiang Wei
Yiting Wang
Lichao Sun
Athanasios V. Vasilakos
Lin Wang
36
0
0
20 Dec 2023
Point Deformable Network with Enhanced Normal Embedding for Point Cloud Analysis
Xingyilang Yin
Xi Yang
Liangchen Liu
Nannan Wang
Xinbo Gao
3DPC
31
3
0
20 Dec 2023
LMD: Faster Image Reconstruction with Latent Masking Diffusion
Zhiyuan Ma
Zhihuan Yu
Jianjun Li
Bowen Zhou
DiffM
24
8
0
13 Dec 2023
PAUMER: Patch Pausing Transformer for Semantic Segmentation
Evann Courdier
Prabhu Teja Sivaprasad
F. Fleuret
37
2
0
01 Nov 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Yizhou Yu
Chuan Wu
Yizhou Yu
ViT
44
36
0
30 Oct 2023
Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers
Yuanduo Hong
Jue Wang
Weichao Sun
Huihui Pan
VLM
ViT
37
7
0
19 Oct 2023
SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical Image Segmentation
Tan-Hanh Pham
Xianqi Li
Kim-Doang Nguyen
MedIm
ViT
26
8
0
16 Oct 2023
Interpretability-Aware Vision Transformer
Yao Qiang
Chengyin Li
Prashant Khanduri
D. Zhu
ViT
82
7
0
14 Sep 2023
CNN Injected Transformer for Image Exposure Correction
Shuning Xu
Xiangyu Chen
Binbin Song
Jiantao Zhou
ViT
19
6
0
08 Sep 2023
TurboViT: Generating Fast Vision Transformers via Generative Architecture Search
Alexander Wong
Saad Abbasi
Saeejith Nair
ViT
27
1
0
22 Aug 2023
FocusFlow: Boosting Key-Points Optical Flow Estimation for Autonomous Driving
Zhonghua Yi
Haowen Shi
Kailun Yang
Qi Jiang
Yaozu Ye
Ze Wang
Huajian Ni
Kaiwei Wang
3DPC
20
9
0
14 Aug 2023
Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference
Boyan Li
Luziwei Leng
Shuaijie Shen
Kaixuan Zhang
Jianguo Zhang
Jianxing Liao
Ran Cheng
28
7
0
21 Jun 2023
R-Mixup: Riemannian Mixup for Biological Networks
Xuan Kan
Zimu Li
Hejie Cui
Yue Yu
Ran Xu
Shaojun Yu
Zilong Zhang
Ying Guo
Carl Yang
33
6
0
05 Jun 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
42
28
0
01 Jun 2023
Bi-VLGM : Bi-Level Class-Severity-Aware Vision-Language Graph Matching for Text Guided Medical Image Segmentation
Wenting Chen
Jie Liu
Yixuan Yuan
VLM
39
3
0
20 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
H. Chen
Jingkuan Song
Feng Zheng
ViT
20
0
0
17 May 2023
MTLSegFormer: Multi-task Learning with Transformers for Semantic Segmentation in Precision Agriculture
D. Gonçalves
J. M. Junior
Pedro Zamboni
H. Pistori
Jonathan Li
Keiller Nogueira
W. Gonçalves
35
5
0
04 May 2023
MASK-CNN-Transformer For Real-Time Multi-Label Weather Recognition
Shengchao Chen
Ting Shu
Huani Zhao
Yuan Yan Tang
ViT
32
15
0
28 Apr 2023
UDTIRI: An Online Open-Source Intelligent Road Inspection Benchmark Suite
Sicen Guo
Jiahang Li
Yi Feng
Dacheng Zhou
D. Zhang
Chen Chen
Shuai Su
Xing-Hui Zhu
Qijun Chen
Rui Fan
26
6
0
18 Apr 2023
Why Existing Multimodal Crowd Counting Datasets Can Lead to Unfulfilled Expectations in Real-World Applications
M. Thissen
Elke Hergenröther
19
1
0
13 Apr 2023
From Saliency to DINO: Saliency-guided Vision Transformer for Few-shot Keypoint Detection
Changsheng Lu
Hao Zhu
Piotr Koniusz
48
11
0
06 Apr 2023
1
2
3
4
Next