
Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention

Abstract

State-space models (SSMs) have recently emerged as a compelling alternative to Transformers for sequence modeling tasks. This paper presents a theoretical generalization analysis of selective SSMs, the core architectural component behind the Mamba model. We derive a novel covering-number-based generalization bound for selective SSMs, building upon recent theoretical advances in the analysis of Transformer models. Using this result, we analyze how the spectral abscissa of the continuous-time state matrix governs the model's training dynamics and its ability to generalize across sequence lengths. We empirically validate our findings on a synthetic majority task and the IMDb sentiment classification benchmark, illustrating how our theoretical insights translate into practical model behavior.
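For context on the abstract's central quantity, the following is a minimal sketch of the selective SSM dynamics and the spectral abscissa, written in the standard Mamba-style notation (h, A, B, C, x, y); this formulation and notation are assumed here for illustration rather than quoted from the paper.

% A minimal sketch of the continuous-time selective SSM underlying Mamba.
% The state matrix A is input-independent, while B and C are functions of
% the input ("selection"); this is the standard formulation, assumed here.
\begin{align*}
  \dot{h}(t) &= A\,h(t) + B(x(t))\,x(t),\\
  y(t)       &= C(x(t))\,h(t).
\end{align*}
% The spectral abscissa of A is the largest real part of its eigenvalues:
\[
  \alpha(A) \;=\; \max_i \operatorname{Re}\,\lambda_i(A).
\]
% When \alpha(A) < 0 the autonomous state contracts, so the influence of
% past inputs decays over the horizon; this is the quantity the abstract
% says governs training dynamics and length generalization.

Here alpha(A) < 0 is the usual asymptotic-stability condition from linear systems theory: the farther the eigenvalues of A sit in the left half-plane, the faster the hidden state forgets old inputs, which is why this scalar plausibly controls how bounds scale with sequence length.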

@article{honarpisheh2025_2502.01473,
  title={Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention},
  author={Arya Honarpisheh and Mustafa Bozdag and Octavia Camps and Mario Sznaier},
  journal={arXiv preprint arXiv:2502.01473},
  year={2025}
}