
Sufficient Dimension Reduction and Modeling Responses Conditioned on Covariates: An Integrated Approach via Convex Optimization

Abstract

Given observations of a collection of responses and covariates $(Y, X) \in \mathbb{R}^p \times \mathbb{R}^q$, sufficient dimension reduction (SDR) techniques aim to identify a mapping $f: \mathbb{R}^q \rightarrow \mathbb{R}^k$ with $k \ll q$ such that $Y|f(X)$ is independent of $X$. The image $f(X)$ summarizes the relevant information in a potentially large number of covariates $X$ that influence the responses $Y$. In many contemporary settings, the number of responses $p$ is also quite large, in addition to a large number $q$ of covariates. This leads to the challenge of fitting a succinctly parameterized statistical model to $Y|f(X)$, a problem that is usually not addressed in a traditional SDR framework. In this paper, we present a computationally tractable estimator, based on a convex relaxation, for simultaneously (a) identifying a linear dimension reduction $f(X)$ of the covariates that is sufficient with respect to the responses, and (b) fitting several types of structured low-dimensional models (factor models, graphical models, and latent-variable graphical models) to the conditional distribution of $Y|f(X)$. We analyze the consistency properties of our estimator in a high-dimensional scaling regime. We also illustrate the performance of our approach on a newsgroup dataset and on a dataset consisting of financial asset prices.
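
As a rough illustration of how a convex relaxation can yield a linear reduction, the sketch below fits a nuclear-norm regularized multi-response regression by proximal gradient descent. This is not the estimator proposed in the paper: it is a simpler, standard stand-in that only captures dependence of the mean of $Y$ on $X$, and it does not fit any model to the conditional distribution of $Y|f(X)$. The function names `svt` and `low_rank_regression`, the penalty weight `lam`, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def svt(M, tau):
    """Soft-threshold the singular values of M (prox of tau * nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def low_rank_regression(X, Y, lam, n_iter=500):
    """Minimize (1/2n) * ||Y - X B||_F^2 + lam * ||B||_* by proximal gradient.

    Illustrative stand-in only; not the estimator from the paper.
    """
    n, q = X.shape
    B = np.zeros((q, Y.shape[1]))
    # Step size 1/L, where L = ||X||_2^2 / n bounds the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Y) / n
        B = svt(B - step * grad, step * lam)
    return B

# Synthetic data (assumed): Y depends on X only through k linear combinations.
rng = np.random.default_rng(0)
n, q, p, k = 500, 30, 20, 2
A = rng.standard_normal((q, k))   # hypothetical true reduction directions
C = rng.standard_normal((k, p))   # hypothetical loadings onto the responses
X = rng.standard_normal((n, q))
Y = X @ A @ C + 0.1 * rng.standard_normal((n, p))

B_hat = low_rank_regression(X, Y, lam=0.5)

# The leading left singular vectors of B_hat span the estimated reduction.
U, sing_vals, _ = np.linalg.svd(B_hat, full_matrices=False)
est_rank = int(np.sum(sing_vals > 1e-6 * sing_vals[0]))
reduced_X = X @ U[:, :est_rank]   # plays the role of f(X), of shape (n, est_rank)
```

The nuclear norm acts as a convex surrogate for the rank of the coefficient matrix, so the number of retained singular values plays the role of the reduced dimension $k$. The penalty weight would need to be tuned in practice.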
