Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention

State-space models (SSMs) have recently emerged as a compelling alternative to Transformers for sequence modeling tasks. This paper presents a theoretical generalization analysis of selective SSMs, the core architectural component behind the Mamba model. We derive a novel covering number-based generalization bound for selective SSMs, building upon recent theoretical advances in the analysis of Transformer models. Using this result, we analyze how the spectral abscissa of the continuous-time state matrix governs the model's training dynamics and its ability to generalize across sequence lengths. We empirically validate our findings on a synthetic majority task and the IMDb sentiment classification benchmark, illustrating how our theoretical insights translate into practical model behavior.
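As a point of reference for the quantities named above, the following is a minimal sketch, assuming a standard Mamba-style selective SSM parameterization with input-dependent projections and zero-order-hold discretization; the paper's exact formulation may differ in details.

\begin{align*}
  \dot{x}(t) &= A\,x(t) + B(u_t)\,u(t), \qquad y(t) = C(u_t)\,x(t),
  && \text{(continuous-time selective dynamics)}\\
  \bar{A}_t &= \exp\!\big(\Delta(u_t)\,A\big), \qquad
  \bar{B}_t = \big(\Delta(u_t)\,A\big)^{-1}\big(\bar{A}_t - I\big)\,\Delta(u_t)\,B(u_t),
  && \text{(input-dependent discretization)}\\
  \alpha(A) &= \max_i \operatorname{Re}\,\lambda_i(A).
  && \text{(spectral abscissa of the state matrix)}
\end{align*}

The spectral abscissa $\alpha(A)$ determines how quickly the discretized state transition $\bar{A}_t$ contracts, which is why it appears as the controlling quantity for stability of training and for generalization across sequence lengths.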
@article{honarpisheh2025_2502.01473,
  title   = {Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention},
  author  = {Arya Honarpisheh and Mustafa Bozdag and Octavia Camps and Mario Sznaier},
  journal = {arXiv preprint arXiv:2502.01473},
  year    = {2025}
}