Train Sparse Autoencoders Efficiently by Utilizing Features Correlation

28 May 2025
Vadim Kurochkin
Yaroslav Aksenov
Daniil Laptev
Daniil Gavrilov
Nikita Balagansky
Abstract

Sparse Autoencoders (SAEs) have demonstrated significant promise in interpreting the hidden states of language models by decomposing them into interpretable latent directions. However, training SAEs at scale remains challenging, especially when large dictionary sizes are used. While decoders can leverage sparse-aware kernels for efficiency, encoders still require computationally intensive linear operations with large output dimensions. To address this, we propose KronSAE, a novel architecture that factorizes the latent representation via Kronecker product decomposition, drastically reducing memory and computational overhead. Furthermore, we introduce mAND, a differentiable activation function approximating the binary AND operation, which improves interpretability and performance in our factorized framework.
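
The abstract does not spell out the factorization in detail, so the following minimal PyTorch sketch only illustrates one way a Kronecker-factorized SAE encoder could look. The module name KronEncoderSketch, the head widths m and n, and the product-of-ReLUs AND surrogate are illustrative assumptions; the paper's actual KronSAE and mAND definitions may differ.

import torch
import torch.nn as nn

class KronEncoderSketch(nn.Module):
    """Hypothetical sketch of a Kronecker-factorized SAE encoder.

    Instead of one dense d_model -> (m * n) linear layer, two smaller
    heads of widths m and n are combined pairwise, so the dictionary
    of size m * n is never materialized as a dense weight matrix.
    """

    def __init__(self, d_model: int, m: int, n: int):
        super().__init__()
        self.enc_u = nn.Linear(d_model, m)  # first factor head
        self.enc_v = nn.Linear(d_model, n)  # second factor head

    @staticmethod
    def and_like(u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
        # One plausible differentiable AND surrogate: a product of
        # non-negative activations is nonzero only when both factors
        # fire. This stands in for the paper's mAND, whose exact form
        # is not given in the abstract.
        return torch.relu(u) * torch.relu(v)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        u = self.enc_u(x)  # (batch, m)
        v = self.enc_v(x)  # (batch, n)
        # Pairwise (Kronecker-style) combination -> (batch, m, n),
        # flattened into m * n latent activations.
        z = self.and_like(u.unsqueeze(-1), v.unsqueeze(-2))
        return z.flatten(start_dim=-2)

Under these assumptions, the encoder stores d_model * (m + n) weights instead of d_model * (m * n), which is where the claimed memory and compute savings over a standard SAE encoder would come from.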

@article{kurochkin2025_2505.22255,
  title={Train Sparse Autoencoders Efficiently by Utilizing Features Correlation},
  author={Vadim Kurochkin and Yaroslav Aksenov and Daniil Laptev and Daniil Gavrilov and Nikita Balagansky},
  journal={arXiv preprint arXiv:2505.22255},
  year={2025}
}