
SelfFed: Self-Supervised Federated Learning for Data Heterogeneity and Label Scarcity in Medical Images

Abstract

Self-supervised learning in the federated learning paradigm has been gaining considerable interest in both industry and research due to its ability to learn collaboratively from unlabeled yet isolated data. However, self-supervised federated learning strategies suffer from performance degradation due to label scarcity and diverse data distributions, i.e., data heterogeneity. In this paper, we propose the SelfFed framework for medical images to overcome the data heterogeneity and label scarcity issues. The first phase of the SelfFed framework addresses data heterogeneity through a pre-training paradigm that performs augmentative modeling with a Swin Transformer-based encoder in a decentralized manner. The label scarcity issue is addressed by a fine-tuning paradigm that introduces a contrastive network and a novel aggregation strategy. We perform our experimental analysis on publicly available medical imaging datasets and show that SelfFed outperforms existing baselines and prior works. Our method achieves a maximum improvement of 8.8% and 4.1% on the non-IID Retina and COVID-FL datasets, respectively. Furthermore, our proposed method outperforms existing baselines even when trained on only a few (10%) labeled instances.
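
The abstract describes a two-phase design: decentralized self-supervised pre-training of an encoder, followed by fine-tuning with an aggregation step. The sketch below is not the authors' implementation; it is a minimal PyTorch illustration of such a two-phase federated workflow, assuming FedAvg-style weight averaging and a simple masked-reconstruction proxy for the self-supervised objective (the names `fedavg`, `local_pretrain`, `global_encoder`, and `client_loaders` are illustrative, not from the paper).

```python
# Hypothetical sketch of a two-phase federated workflow (not the SelfFed code).
# Phase 1: each client pre-trains an encoder with a self-supervised objective on
# unlabeled local data; the server averages the encoder weights (FedAvg-style).
# Phase 2 (not shown) would fine-tune on the small labeled subset and aggregate.
import copy
import torch
import torch.nn as nn

def fedavg(state_dicts):
    """Uniform average of client model parameters (FedAvg-style aggregation)."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg

def local_pretrain(encoder, unlabeled_loader, epochs=1, lr=1e-4):
    """Placeholder self-supervised step: reconstruct masked pixels as a stand-in
    for the paper's augmentative modeling objective."""
    encoder = copy.deepcopy(encoder)
    opt = torch.optim.AdamW(encoder.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x in unlabeled_loader:               # x: image batch, no labels needed
            mask = (torch.rand_like(x) > 0.5).float()
            recon = encoder(x * mask)            # assumes encoder outputs input-shaped maps
            loss = loss_fn(recon * (1 - mask), x * (1 - mask))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return encoder.state_dict()

# One federated pre-training round (assumed objects: global_encoder, client_loaders):
# client_states = [local_pretrain(global_encoder, dl) for dl in client_loaders]
# global_encoder.load_state_dict(fedavg(client_states))
```

In this hypothetical setup, only model weights leave each client, so the unlabeled local data stays on-site; the labeled 10% subset mentioned in the abstract would only be needed in the later fine-tuning phase.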

@article{khowaja2025_2307.01514,
  title={SelfFed: Self-Supervised Federated Learning for Data Heterogeneity and Label Scarcity in Medical Images},
  author={Sunder Ali Khowaja and Kapal Dev and Syed Muhammad Anwar and Marius George Linguraru},
  journal={arXiv preprint arXiv:2307.01514},
  year={2025}
}