Clustered Federated Learning via Embedding Distributions

Main: 9 pages
Bibliography: 3 pages
Appendix: 12 pages
17 figures
11 tables
Abstract

Federated learning (FL) is a widely used framework for machine learning in distributed data environments where clients hold data that cannot easily be centralised, for example for data protection reasons. FL, however, is known to degrade when client data are not independent and identically distributed (non-IID). Clustered FL addresses this issue by grouping clients into more homogeneous clusters. We propose a novel one-shot clustering method, EMD-CFL, which uses the Earth Mover's distance (EMD) between client data distributions in embedding space. We theoretically motivate the use of EMDs with results from the domain adaptation literature and demonstrate empirically superior clustering performance in extensive comparisons against 16 baselines on a range of challenging datasets.
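
To make the idea of clustering clients by EMD in embedding space concrete, the following is a minimal sketch and not the authors' implementation. It assumes every client holds the same number of embedding vectors with uniform weights, in which case the exact EMD reduces to an optimal one-to-one matching. The function names `emd_equal_size` and `cluster_clients`, the synthetic Gaussian embeddings, and the choice of average-linkage hierarchical clustering are all illustrative assumptions; the abstract does not specify the embedding model or the clustering algorithm.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist, squareform


def emd_equal_size(x, y):
    """Exact EMD between two equally sized, uniformly weighted point sets.

    Assumption: len(x) == len(y), so the optimal transport plan is a
    one-to-one matching and can be found with the assignment solver.
    """
    cost = cdist(x, y)                        # Euclidean ground distances
    rows, cols = linear_sum_assignment(cost)  # minimum-cost matching
    return cost[rows, cols].mean()


def cluster_clients(client_embeddings, n_clusters):
    """Cluster clients once from their pairwise EMD matrix (illustrative)."""
    k = len(client_embeddings)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            d = emd_equal_size(client_embeddings[i], client_embeddings[j])
            dist[i, j] = dist[j, i] = d
    # Average-linkage clustering on the condensed distance matrix.
    tree = linkage(squareform(dist), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Hypothetical data: four clients, two drawn around 0 and two around 5.
rng = np.random.default_rng(0)
clients = [rng.normal(loc=m, scale=1.0, size=(50, 8)) for m in (0.0, 0.0, 5.0, 5.0)]
labels = cluster_clients(clients, n_clusters=2)
print(labels)
```

In this toy setting the two clients drawn around 0 should fall into one cluster and the two drawn around 5 into the other. With unequal sample sizes or non-uniform weights, a general optimal transport solver would be needed in place of the assignment step.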

@article{zhang2025_2506.07769,
  title={Clustered Federated Learning via Embedding Distributions},
  author={Dekai Zhang and Matthew Williams and Francesca Toni},
  journal={arXiv preprint arXiv:2506.07769},
  year={2025}
}