13
18

Interpretable, similarity-driven multi-view embeddings from high-dimensional biomedical data

Abstract

Similarity-driven multi-view linear reconstruction (SiMLR) is an algorithm that exploits inter-modality relationships to transform large scientific datasets into smaller, more well-powered and interpretable low-dimensional spaces. SiMLR contributes a novel objective function for identifying joint signal, regularization based on sparse matrices representing prior within-modality relationships and an implementation that permits application to joint reduction of large data matrices, each of which may have millions of entries. We demonstrate that SiMLR outperforms closely related methods on supervised learning problems in simulation data, a multi-omics cancer survival prediction dataset and multiple modality neuroimaging datasets. Taken together, this collection of results shows that SiMLR may be applied with default parameters to joint signal estimation from disparate modalities and may yield practically useful results in a variety of application domains.

View on arXiv
Comments on this paper