GSFF-SLAM: 3D Semantic Gaussian Splatting SLAM via Feature Field

Semantic-aware 3D scene reconstruction is essential for autonomous robots to perform complex interactions. Semantic SLAM, an online approach that integrates pose tracking, geometric reconstruction, and semantic mapping into a unified framework, shows significant potential. However, existing systems, which rely on 2D ground truth priors for supervision, are often limited by the sparsity and noise of these signals in real-world environments. To address this challenge, we propose GSFF-SLAM, a novel dense semantic SLAM system based on 3D Gaussian Splatting that leverages feature fields to achieve joint rendering of appearance, geometry, and N-dimensional semantic features. By independently optimizing feature gradients, our method supports semantic reconstruction from various forms of 2D priors, particularly sparse and noisy signals. Experimental results demonstrate that our approach outperforms previous methods in both tracking accuracy and photorealistic rendering quality. When utilizing 2D ground truth priors, GSFF-SLAM achieves state-of-the-art semantic segmentation performance with 95.03% mIoU, while achieving up to a 2.9× speedup with only marginal performance degradation.
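The "joint rendering of N-dimensional semantic features" described above can be illustrated with the standard Gaussian Splatting alpha-blending equation applied channel-wise to per-Gaussian feature vectors. The sketch below is a minimal NumPy illustration for a single pixel; the function name and array shapes are assumptions for exposition, and the paper's actual CUDA rasterizer and gradient handling differ.

```python
import numpy as np

def composite_features(alphas, feats):
    """Front-to-back alpha compositing of per-Gaussian feature vectors
    for one pixel (illustrative sketch, not the paper's implementation).

    alphas: (K,) effective opacity of the K splats covering the pixel,
            sorted front to back.
    feats:  (K, N) N-dimensional semantic feature attached to each Gaussian.
    Returns the rendered (N,) feature for the pixel.
    """
    transmittance = 1.0
    out = np.zeros(feats.shape[1])
    for a, f in zip(alphas, feats):
        out += transmittance * a * f       # weight by remaining transmittance
        transmittance *= (1.0 - a)         # attenuate for splats behind
    return out

# Example: two splats covering a pixel, 3-dimensional features.
alphas = np.array([0.6, 0.5])
feats = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
print(composite_features(alphas, feats))  # front splat dominates
```

Because the same weights render color and features, per-Gaussian feature gradients can be optimized independently of the appearance channels, which is what allows supervision from sparse or noisy 2D priors.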
@article{lu2025_2504.19409,
  title={GSFF-SLAM: 3D Semantic Gaussian Splatting SLAM via Feature Field},
  author={Zuxing Lu and Xin Yuan and Shaowen Yang and Jingyu Liu and Jiawei Wang and Changyin Sun},
  journal={arXiv preprint arXiv:2504.19409},
  year={2025}
}