
A Separable Self-attention Inspired by the State Space Model for Computer Vision

3 January 2025
Juntao Zhang
Shaogeng Liu
Kun Bian
You Zhou
Pei Zhang
Jianning Liu
Jun Zhou
Bingyan Liu
    Mamba
arXiv:2501.02040 (abs · PDF · HTML)
Main: 6 pages · 5 figures · 6 tables · Bibliography: 2 pages
Abstract

Mamba is an efficient State Space Model (SSM) with linear computational complexity. Although SSMs are not suitable for handling non-causal data, Vision Mamba (ViM) methods still demonstrate good performance in tasks such as image classification and object detection. Recent studies have shown that there is a rich theoretical connection between state space models and attention variants. We propose a novel separable self-attention method that, for the first time, introduces some of Mamba's excellent design concepts into separable self-attention. To ensure a fair comparison with ViMs, we introduce VMINet, a simple yet powerful prototype architecture constructed solely by stacking our novel attention modules with the most basic down-sampling layers. Notably, VMINet differs significantly from the conventional Transformer architecture. Our experiments demonstrate that VMINet achieves competitive results on image classification and high-resolution dense prediction tasks. Code is available at: \url{this https URL}.
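
The abstract does not detail the module itself, so the following is only a minimal sketch of what a linear-complexity separable self-attention block, and a stage that stacks such blocks after a basic down-sampling layer, might look like; it is not the paper's actual VMINet design. The class names, the strided-convolution down-sampling, the residual connection, and the context-vector form of the attention are all assumptions made for illustration.

```python
# Hypothetical sketch only: a generic separable self-attention block with
# linear complexity, plus a stage that stacks such blocks after a basic
# down-sampling layer. Not the VMINet implementation described in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SeparableSelfAttention(nn.Module):
    """Summarizes all tokens into one context vector instead of building an
    N x N attention matrix, so the cost grows linearly with the token count."""

    def __init__(self, dim: int):
        super().__init__()
        self.to_score = nn.Linear(dim, 1)    # one context score per token
        self.to_key = nn.Linear(dim, dim)
        self.to_value = nn.Linear(dim, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, N, dim) -- N flattened image tokens
        scores = F.softmax(self.to_score(x), dim=1)                    # (B, N, 1)
        context = (scores * self.to_key(x)).sum(dim=1, keepdim=True)   # (B, 1, dim)
        out = F.relu(self.to_value(x)) * context                       # broadcast to every token
        return self.proj(out)                                          # (B, N, dim), O(N * dim)


class Stage(nn.Module):
    """A stage in the spirit of 'attention modules stacked with the most basic
    down-sampling layers': a strided conv followed by several attention blocks."""

    def __init__(self, in_dim: int, out_dim: int, depth: int):
        super().__init__()
        self.down = nn.Conv2d(in_dim, out_dim, kernel_size=2, stride=2)  # basic down-sampling (assumption)
        self.blocks = nn.ModuleList(
            [SeparableSelfAttention(out_dim) for _ in range(depth)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map
        x = self.down(x)
        B, C, H, W = x.shape
        tokens = x.flatten(2).transpose(1, 2)     # (B, H*W, C)
        for blk in self.blocks:
            tokens = tokens + blk(tokens)         # residual connection (assumption)
        return tokens.transpose(1, 2).reshape(B, C, H, W)
```

Because the attention is reduced to a single context vector, the per-layer cost scales linearly with the number of tokens, which is the property the paper shares with Mamba and contrasts with the quadratic cost of standard self-attention.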
