Nonparametric Bayesian sparse factor models with application to gene expression modeling

29 November 2010

Abstract

A nonparametric Bayesian extension of Factor Analysis (FA) is proposed in which the observed data $\mathbf{Y}$ are modeled as a linear superposition, with mixing matrix $\mathbf{G}$, of a potentially infinite number of hidden factors, $\mathbf{X}$. The Indian Buffet Process (IBP) is used as a prior on $\mathbf{G}$ to incorporate sparsity and to allow the number of latent features to be inferred from the data. The model's utility for modeling gene expression data is investigated using randomly generated data sets based on a known sparse connectivity matrix for E. coli, and on three biological data sets of increasing complexity.
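The generative idea described above can be illustrated with a minimal sketch. The abstract does not give the full model specification, so several choices here are assumptions made purely for illustration: the one-parameter IBP with concentration `alpha`, standard Gaussian loadings and factors, isotropic Gaussian noise, and the hypothetical function names `sample_ibp` and `sample_sparse_factor_data`. The sketch only draws synthetic data from such a prior; it does not perform inference.

```python
import numpy as np


def sample_ibp(num_rows, alpha, rng):
    """Draw a binary matrix Z (num_rows x K) from a one-parameter IBP.

    K is random: it is the total number of features introduced by all rows.
    """
    counts = []  # counts[k] = number of rows so far that use feature k
    rows = []
    for i in range(num_rows):
        row = []
        # Existing features: row i takes feature k with probability m_k / (i + 1).
        for k in range(len(counts)):
            take = int(rng.random() < counts[k] / (i + 1))
            row.append(take)
            counts[k] += take
        # New features: Poisson(alpha / (i + 1)) of them, all used by row i.
        num_new = int(rng.poisson(alpha / (i + 1)))
        row.extend([1] * num_new)
        counts.extend([1] * num_new)
        rows.append(row)

    # Pad earlier rows with zeros for features introduced later.
    Z = np.zeros((num_rows, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, : len(row)] = row
    return Z


def sample_sparse_factor_data(num_genes, num_samples, alpha=2.0,
                              noise_std=0.1, seed=0):
    """Sample Y = G X + noise, with the sparsity pattern of G drawn from an IBP.

    Assumptions (not taken from the abstract): Gaussian loadings, Gaussian
    factors, and isotropic Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    Z = sample_ibp(num_genes, alpha, rng)          # which gene loads on which factor
    G = Z * rng.normal(size=Z.shape)               # sparse mixing matrix
    X = rng.normal(size=(Z.shape[1], num_samples)) # hidden factors
    noise = noise_std * rng.normal(size=(num_genes, num_samples))
    Y = G @ X + noise
    return Y, G, X, Z
```

For example, `Y, G, X, Z = sample_sparse_factor_data(20, 5)` returns a 20 x 5 data matrix together with the sampled mixing matrix, factors, and binary sparsity pattern. The number of columns of `G` is not fixed in advance, which mirrors the way the IBP prior lets the number of latent features be inferred rather than specified.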
