The global prevalence of diabetes, particularly type 2 diabetes mellitus (T2DM), is rapidly increasing, posing significant health and economic challenges. T2DM not only disrupts blood glucose regulation but also damages vital organs such as the heart, kidneys, eyes, nerves, and blood vessels, leading to substantial morbidity and mortality. In the US alone, the economic burden of diagnosed diabetes exceeded $400 billion in 2022. Early detection of individuals at risk is critical to mitigating these impacts. While machine learning approaches for T2DM prediction are increasingly adopted, many rely on supervised learning, which is often limited by the lack of confirmed negative cases. To address this limitation, we propose a novel unsupervised framework that integrates Non-negative Matrix Factorization (NMF) with statistical techniques to identify individuals at risk of developing T2DM. Our method identifies latent patterns of multimorbidity and polypharmacy among diagnosed T2DM patients and applies these patterns to estimate the T2DM risk in undiagnosed individuals. By leveraging data-driven insights from comorbidity and medication usage, our approach provides an interpretable and scalable solution that can assist healthcare providers in implementing timely interventions, ultimately improving patient outcomes and potentially reducing the future health and economic burden of T2DM.
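The general idea of learning latent patterns from diagnosed patients and applying them to undiagnosed individuals can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the multiplicative-update NMF, the helper names (`fit_nmf`, `project`, `risk_score`), and the cosine-similarity score are all assumptions made for illustration, and the abstract does not specify which statistical techniques are combined with NMF.

```python
import numpy as np

def fit_nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Factor X (patients x features) as W @ H with multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def project(X_new, H, n_iter=200, seed=0, eps=1e-9):
    """Keep the learned patterns H fixed and fit non-negative loadings for new rows."""
    rng = np.random.default_rng(seed)
    W = rng.random((X_new.shape[0], H.shape[0])) + eps
    for _ in range(n_iter):
        W *= (X_new @ H.T) / (W @ H @ H.T + eps)
    return W

def risk_score(X_new, H, eps=1e-9):
    """Hypothetical score: cosine similarity between each row and its
    reconstruction from the patterns learned on diagnosed patients."""
    R = project(X_new, H) @ H
    num = (X_new * R).sum(axis=1)
    den = np.linalg.norm(X_new, axis=1) * np.linalg.norm(R, axis=1) + eps
    return num / den

# Toy binary matrix (assumed data): rows are diagnosed patients, columns are
# comorbidity or medication flags. Column 5 never occurs among the diagnosed.
diagnosed = np.array(
    [[1, 1, 1, 0, 0, 0]] * 4 + [[0, 0, 0, 1, 1, 0]] * 4, dtype=float
)
_, H = fit_nmf(diagnosed, k=2)

# Two undiagnosed individuals: one matches a learned pattern, one does not.
undiagnosed = np.array(
    [[1, 1, 1, 0, 0, 0],
     [0, 0, 0, 0, 0, 1]], dtype=float
)
scores = risk_score(undiagnosed, H)
print(scores)
```

In this sketch, a higher score only means that an individual's profile is well explained by patterns seen among diagnosed patients. Turning such a score into a calibrated risk estimate would need the statistical step that the abstract mentions but does not detail.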
@article{kumar2025_2505.21824,
title={Unsupervised Latent Pattern Analysis for Estimating Type 2 Diabetes Risk in Undiagnosed Populations},
author={Praveen Kumar and Vincent T. Metzger and Scott A. Malec},
journal={arXiv preprint arXiv:2505.21824},
year={2025}
}