Robust Multimodal Segmentation with Representation Regularization and Hybrid Prototype Distillation

Main: 9 pages, 4 figures, 9 tables; Bibliography: 4 pages; Appendix: 2 pages
Abstract

Multi-modal semantic segmentation (MMSS) faces significant challenges in real-world scenarios due to dynamic environments, sensor failures, and noise interference, creating a gap between theoretical models and practical performance. To address this, we propose a two-stage framework called RobustSeg, which enhances multi-modal robustness through two key components: the Hybrid Prototype Distillation Module (HPDM) and the Representation Regularization Module (RRM). In the first stage, RobustSeg pre-trains a multi-modal teacher model using complete modalities. In the second stage, a student model is trained with random modality dropout while learning from the teacher via HPDM and RRM. HPDM transforms features into compact prototypes, enabling cross-modal hybrid knowledge distillation and mitigating bias from missing modalities. RRM reduces representation discrepancies between the teacher and student by optimizing functional entropy through the log-Sobolev inequality. Extensive experiments on three public benchmarks demonstrate that RobustSeg outperforms previous state-of-the-art methods, achieving improvements of +2.76%, +4.56%, and +0.98%, respectively. Code is available at: this https URL.
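
For readers who want a concrete picture of the second training stage, below is a minimal PyTorch sketch of its two core ideas: random modality dropout on the student's inputs, and cross-modal prototype distillation from the frozen teacher. It assumes masked average pooling to build per-class prototypes and an MSE matching loss; the function names here (modality_dropout, masked_average_prototypes, hybrid_prototype_distillation) are hypothetical, and the RRM's log-Sobolev functional-entropy regularizer is omitted, so this illustrates the general technique rather than the paper's exact implementation.

import torch
import torch.nn.functional as F

# Illustrative sketch only, not the authors' code. Prototype construction
# uses masked average pooling (an assumption); the paper's HPDM may differ.

def masked_average_prototypes(feats, labels, num_classes):
    # feats: (B, C, H, W) feature map; labels: (B, H, W) integer class map.
    # Returns one C-dimensional prototype per class via masked average pooling.
    protos = feats.new_zeros(num_classes, feats.shape[1])
    for k in range(num_classes):
        mask = (labels == k).unsqueeze(1).float()        # (B, 1, H, W)
        denom = mask.sum().clamp(min=1.0)                # avoid divide-by-zero
        protos[k] = (feats * mask).sum(dim=(0, 2, 3)) / denom
    return protos                                        # (num_classes, C)

def hybrid_prototype_distillation(teacher_feats, student_feats, labels, num_classes):
    # "Hybrid" cross-modal distillation: each student modality's prototypes
    # are matched against a randomly chosen teacher modality's prototypes.
    modalities = list(teacher_feats.keys())
    loss = 0.0
    for m, sf in student_feats.items():
        mt = modalities[torch.randint(len(modalities), (1,)).item()]
        p_t = masked_average_prototypes(teacher_feats[mt], labels, num_classes)
        p_s = masked_average_prototypes(sf, labels, num_classes)
        loss = loss + F.mse_loss(p_s, p_t.detach())      # teacher is frozen
    return loss

def modality_dropout(inputs, p=0.3):
    # Randomly zero out whole modalities on the student side,
    # always keeping at least one modality active.
    keys = list(inputs.keys())
    kept = [k for k in keys if torch.rand(1).item() > p]
    if not kept:
        kept = [keys[torch.randint(len(keys), (1,)).item()]]
    return {k: v if k in kept else torch.zeros_like(v) for k, v in inputs.items()}

if __name__ == "__main__":
    # Toy example: two modalities, 4-class labels, 8x8 feature maps.
    B, C, H, W, K = 2, 16, 8, 8, 4
    labels = torch.randint(K, (B, H, W))
    teacher = {"rgb": torch.randn(B, C, H, W), "depth": torch.randn(B, C, H, W)}
    dropped = modality_dropout(teacher)                  # simulate missing modalities
    student = {k: v + 0.1 * torch.randn_like(v) for k, v in dropped.items()}
    print(float(hybrid_prototype_distillation(teacher, student, labels, K)))

In a full pipeline this distillation term would be added to the segmentation loss and the RRM regularizer, with the teacher kept frozen after stage one.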

@article{tan2025_2505.12861,
  title={Robust Multimodal Segmentation with Representation Regularization and Hybrid Prototype Distillation},
  author={Jiaqi Tan and Xu Zheng and Yang Liu},
  journal={arXiv preprint arXiv:2505.12861},
  year={2025}
}