
Graphically dependent and spatially varying Dirichlet process mixtures

Abstract

We consider the problem of clustering grouped and functional data, which are indexed by a covariate, and assessing the dependency of the clustered groups on the covariate. We assume that each observation within a group is a draw from a mixture model. The mixture components and the number of such components can change with the covariate, and are assumed to be unknown a priori. In addition to learning the "local" clusters within each group, we also assume the existence of "global" clusters indexed over the covariate domain when the observations across the groups are jointly analyzed. The number of global clusters is also unknown and is to be inferred from the data. We propose a nonparametric Bayesian solution to this problem, building on the theory of dependent Dirichlet processes, where the dependency among the Dirichlet processes is regulated by a spatial or a graphical model distribution indexed by the covariate. In our proposed model, the global clusters are supported by a Dirichlet process, while the local clusters are randomly selected using another hierarchy of Dirichlet processes. We provide an analysis of the model properties, including a stick-breaking and a Pólya-urn scheme characterization. The graphical and spatial dependencies are investigated, along with a discussion of the model identifiability. We present MCMC sampling methods, and discuss the computational implications of using a spatial or a graphical model distribution as the base measure in our model. Finally, the model behavior and inference algorithm are demonstrated on several data examples, including a clustering analysis of the progesterone hormone data.
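
As a rough illustration of the stick-breaking idea mentioned in the abstract, the sketch below draws a truncated stick-breaking approximation of an ordinary, single Dirichlet process. It is not the dependent model proposed in the paper: the function names, the truncation level, the concentration value, and the standard normal base measure are all assumptions chosen for illustration.

```python
import random


def stick_breaking_weights(alpha, num_atoms, rng):
    """Return the first `num_atoms` stick-breaking weights of DP(alpha, H).

    Each weight is beta_k * prod_{j<k} (1 - beta_j) with beta_k ~ Beta(1, alpha).
    Truncating at `num_atoms` is an approximation (assumption of this sketch).
    """
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for _ in range(num_atoms):
        beta = rng.betavariate(1.0, alpha)
        weights.append(beta * remaining)
        remaining *= 1.0 - beta
    return weights


def sample_truncated_dp(alpha, num_atoms, base_sampler, rng):
    """Pair stick-breaking weights with atoms drawn i.i.d. from the base measure."""
    weights = stick_breaking_weights(alpha, num_atoms, rng)
    atoms = [base_sampler(rng) for _ in range(num_atoms)]
    return weights, atoms


# Hypothetical settings: alpha = 2, 50 atoms, standard normal base measure.
rng = random.Random(0)
weights, atoms = sample_truncated_dp(
    alpha=2.0,
    num_atoms=50,
    base_sampler=lambda r: r.gauss(0.0, 1.0),
    rng=rng,
)
```

Smaller values of `alpha` concentrate most of the mass on the first few atoms, which corresponds to fewer effective clusters; the truncated weights sum to slightly less than one, with the shortfall being the mass of the discarded tail.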
