Ontology- and LLM-based Data Harmonization for Federated Learning in Healthcare

Abstract

The rise of electronic health records (EHRs) has unlocked new opportunities for medical research, but privacy regulations and data heterogeneity remain key barriers to large-scale machine learning. Federated learning (FL) enables collaborative modeling without sharing raw data, yet faces challenges in harmonizing diverse clinical datasets. This paper presents a two-step data alignment strategy integrating ontologies and large language models (LLMs) to support secure, privacy-preserving FL in healthcare, demonstrating its effectiveness in a real-world project involving semantic mapping of EHR data.

@article{kokash2025_2505.20020,
  title={Ontology- and LLM-based Data Harmonization for Federated Learning in Healthcare},
  author={Natallia Kokash and Lei Wang and Thomas H. Gillespie and Adam Belloum and Paola Grosso and Sara Quinney and Lang Li and Bernard de Bono},
  journal={arXiv preprint arXiv:2505.20020},
  year={2025}
}