$S^3$ -- Semantic Signal Separation

13 June 2024
Márton Kardos
Jan Kostkan
Arnault-Quentin Vermillet
Kristoffer Laigaard Nielbo
K. Enevoldsen
Roberta Rocca
Abstract

Topic models are useful tools for discovering latent semantic structures in large textual corpora. Recent efforts have been oriented at incorporating contextual representations in topic modeling and have been shown to outperform classical topic models. These approaches are typically slow, volatile, and require heavy preprocessing for optimal results. We present Semantic Signal Separation ($S^3$), a theory-driven topic modeling approach in neural embedding spaces. $S^3$ conceptualizes topics as independent axes of semantic space and uncovers these by decomposing contextualized document embeddings using Independent Component Analysis. Our approach provides diverse and highly coherent topics, requires no preprocessing, and is demonstrated to be the fastest contextual topic model, being, on average, 4.5x faster than the runner-up BERTopic. We offer an implementation of $S^3$, and all contextual baselines, in the Turftopic Python package.
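The core idea in the abstract, decomposing document embeddings with Independent Component Analysis so that each recovered axis can be read as a topic, can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the Turftopic API. It relies on scikit-learn's FastICA, and it replaces real sentence-encoder output with synthetic embeddings. The array sizes, the placeholder vocabulary, and the step of describing each axis by projecting word embeddings onto it are all assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration.
n_docs, n_words, dim, n_topics = 200, 50, 16, 4

# Synthetic stand-in for contextual document embeddings (assumption):
# independent non-Gaussian "semantic signals" mixed linearly into the
# embedding space, plus a little noise. In practice these would come
# from a sentence encoder applied to the raw documents.
sources = rng.laplace(size=(n_docs, n_topics))
mixing = rng.normal(size=(n_topics, dim))
doc_embeddings = sources @ mixing + 0.01 * rng.normal(size=(n_docs, dim))

# Synthetic stand-in for vocabulary embeddings in the same space (assumption).
vocab = [f"word_{i}" for i in range(n_words)]
word_embeddings = rng.laplace(size=(n_words, n_topics)) @ mixing

# Decompose the document embeddings into independent components.
ica = FastICA(
    n_components=n_topics,
    whiten="unit-variance",
    random_state=0,
    max_iter=1000,
)
doc_topic = ica.fit_transform(doc_embeddings)  # shape: (n_docs, n_topics)

# Assumed way of describing each axis: project the word embeddings
# onto the same independent axes and rank words by their value.
word_topic = ica.transform(word_embeddings)  # shape: (n_words, n_topics)


def top_words(topic, k=5):
    """Return the k words with the highest value on the given axis."""
    order = np.argsort(word_topic[:, topic])[::-1][:k]
    return [vocab[i] for i in order]


for t in range(n_topics):
    print(t, top_words(t))
```

With real data, `doc_embeddings` and `word_embeddings` would be produced by the same encoder, so that documents and vocabulary terms live in one embedding space before the decomposition is applied.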

@article{kardos2025_2406.09556,
  title={ $S^3$ -- Semantic Signal Separation },
  author={ Márton Kardos and Jan Kostkan and Arnault-Quentin Vermillet and Kristoffer Nielbo and Kenneth Enevoldsen and Roberta Rocca },
  journal={arXiv preprint arXiv:2406.09556},
  year={ 2025 }
}