Efficient Statistics for Sparse Graphical Models from Truncated Samples

In this paper, we study high-dimensional estimation from truncated samples. We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models. (i) For Gaussian graphical models, suppose $d$-dimensional samples $\mathbf{x}$ are generated from a Gaussian $\mathcal{N}(\mu, \Sigma)$ and observed only if they belong to a subset $S \subseteq \mathbb{R}^d$. We show that $\mu$ and $\Sigma$ can be estimated with error $\epsilon$ in the Frobenius norm, using $\tilde{O}\left(\frac{\mathrm{nz}(\Sigma^{-1})}{\epsilon^2}\right)$ samples from a truncated $\mathcal{N}(\mu, \Sigma)$ and having access to a membership oracle for $S$. The set $S$ is assumed to have non-trivial measure under the unknown distribution but is otherwise arbitrary. (ii) For sparse linear regression, suppose samples $(\mathbf{x}, y)$ are generated where $y = \mathbf{x}^\top \Omega^* + \mathcal{N}(0, 1)$ and $(\mathbf{x}, y)$ is seen only if $y$ belongs to a truncation set $S \subseteq \mathbb{R}$. We consider the case that $\Omega^*$ is sparse with a support set of size $k$. Our main result is to establish precise conditions on the problem dimension $d$, the support size $k$, the number of observations $n$, and properties of the samples and the truncation that are sufficient to recover the support of $\Omega^*$. Specifically, we show that under some mild assumptions, only $O(k^2 \log d)$ samples are needed to estimate $\Omega^*$ in the $\ell_\infty$-norm up to a bounded error. For both problems, our estimator minimizes the sum of the finite population negative log-likelihood function and an $\ell_1$-regularization term.
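The observation model in problem (i) can be illustrated with a minimal sketch, which is not the paper's estimator. It draws Gaussian samples and keeps only those accepted by a membership oracle for the truncation set. The function name `sample_truncated_gaussian`, the oracle `in_half_space`, and the choice of a half-space as the truncation set are all assumptions made for illustration. The sketch only shows why truncation matters: the naive empirical mean of the surviving samples is biased away from the true mean.

```python
import numpy as np


def sample_truncated_gaussian(mu, sigma, oracle, n, rng, batch=1024):
    """Draw n samples from N(mu, sigma) conditioned on oracle(x) being True.

    Plain rejection sampling: the truncation set is accessed only
    through the membership oracle, never through an explicit description.
    """
    kept, count = [], 0
    while count < n:
        x = rng.multivariate_normal(mu, sigma, size=batch)
        mask = np.array([oracle(row) for row in x], dtype=bool)
        accepted = x[mask]
        kept.append(accepted)
        count += len(accepted)
    return np.concatenate(kept)[:n]


def in_half_space(x):
    """Hypothetical membership oracle: keep points whose first coordinate is >= 0."""
    return x[0] >= 0.0


rng = np.random.default_rng(0)
d = 3
true_mu = np.zeros(d)
true_sigma = np.eye(d)

samples = sample_truncated_gaussian(true_mu, true_sigma, in_half_space, 5000, rng)

# The naive mean of the observed samples is shifted in the truncated coordinate,
# even though the true mean is zero in every coordinate.
naive_mean = samples.mean(axis=0)
```

Under these assumptions the first entry of `naive_mean` is clearly positive while the others stay near zero, which is the bias a truncation-aware likelihood has to correct.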