Representation Learning for Medical Data

Abstract
We propose a representation learning framework for medical diagnosis domain. It is based on heterogeneous network-based model of diagnostic data as well as modified metapath2vec algorithm for learning latent node representation. We compare the proposed algorithm with other representation learning methods in two practical case studies: symptom/disease classification and disease prediction. We observe a significant performance boost in these task resulting from learning representations of domain data in a form of heterogeneous network.
View on arXivComments on this paper