
Stable Graphical Model Estimation with Random Forests for Discrete, Continuous, and Mixed Variables

Abstract

A conditional independence graph is a concise representation of pairwise conditional independence among many variables. We propose Graphical Random Forests (GRaFo) for estimating pairwise conditional independence relationships among mixed-type variables, i.e. continuous and discrete ones. The number of edges is a tuning parameter in any graphical model estimator, and there is no obvious choice for it. Stability Selection helps choose this parameter with respect to a bound on the expected number of false positives (error control). We evaluate and compare the performance of GRaFo with Stable LASSO (StabLASSO), a LASSO-based alternative, across five simulated settings with p = 50, 100, and 200 variables. We also apply GRaFo to data from the Swiss Health Survey to evaluate how well we can reproduce the interconnection of functional health components, personal factors, and environmental factors, as hypothesized by the World Health Organization's International Classification of Functioning, Disability and Health (ICF). GRaFo performs well with mixed data and, thanks to Stability Selection, provides an error control mechanism for false positive selection.
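To make the error-control idea concrete, the sketch below computes the standard Stability Selection bound of Meinshausen and Bühlmann, E[V] <= q^2 / ((2 * pi_thr - 1) * m), applied to a graph with m = p(p-1)/2 candidate edges. The abstract does not state this formula, so its use here is an assumption. The function names, the threshold value pi_thr = 0.75, and the target of at most one expected false positive are illustrative choices, not values taken from the paper.

```python
import math


def num_candidate_edges(p: int) -> int:
    """Number of possible undirected edges among p variables."""
    return p * (p - 1) // 2


def expected_false_positive_bound(q: float, pi_thr: float, n_candidates: int) -> float:
    """Upper bound on E[V], the expected number of falsely selected edges.

    q is the number of edges selected per subsample and pi_thr is the
    selection-frequency threshold (the bound requires pi_thr > 0.5).
    """
    if not 0.5 < pi_thr <= 1.0:
        raise ValueError("pi_thr must lie in (0.5, 1]")
    return q ** 2 / ((2 * pi_thr - 1) * n_candidates)


def max_edges_per_subsample(ev_max: float, pi_thr: float, n_candidates: int) -> int:
    """Largest integer q for which the bound on E[V] stays at or below ev_max."""
    return math.floor(math.sqrt(ev_max * (2 * pi_thr - 1) * n_candidates))


# Illustrative values only (assumed, not from the paper).
for p in (50, 100, 200):
    m = num_candidate_edges(p)
    q = max_edges_per_subsample(ev_max=1.0, pi_thr=0.75, n_candidates=m)
    bound = expected_false_positive_bound(q, 0.75, m)
    print(f"p={p}: candidate edges={m}, q={q}, bound on E[V]={bound:.3f}")
```

Read this way, fixing a tolerable number of expected false positives determines how many edges may be selected per subsample, which is how the number of edges stops being a free tuning parameter.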
