
Optimal prediction in the linearly transformed spiked model

Abstract

We consider the linearly transformed spiked model, where observations $Y_i$ are noisy linear transforms of unobserved signals of interest $X_i$:
\begin{align*}
Y_i = A_i X_i + \varepsilon_i,
\end{align*}
for $i = 1, \ldots, n$. The transform matrices $A_i$ are also observed. We model $X_i$ as random vectors lying on an unknown low-dimensional space. How should we predict the unobserved signals (regression coefficients) $X_i$? The naive approach of performing regression for each observation separately is inaccurate due to the large noise. Instead, we develop optimal linear empirical Bayes methods for predicting $X_i$ by "borrowing strength" across the different samples. Our methods are applicable to large datasets and rely on weak moment assumptions. The analysis is based on random matrix theory. We discuss applications to signal processing, deconvolution, cryo-electron microscopy, and missing data in the high-noise regime. For missing data, we show in simulations that our methods are faster, more robust to noise and to unequal sampling than well-known matrix completion methods.
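The following is a minimal simulation sketch of the model $Y_i = A_i X_i + \varepsilon_i$, specialized to the missing-data case where each $A_i$ is a coordinate-selection mask. The rank, dimensions, noise level, and the final SVD-based shrinkage step are illustrative assumptions for this sketch, not the paper's optimal linear empirical Bayes predictor; they only illustrate the contrast between per-observation estimation and "borrowing strength" across samples.

```python
# Sketch of the linearly transformed spiked model Y_i = A_i X_i + eps_i,
# assuming rank-1 signals and coordinate-masking transforms A_i
# (the missing-data special case). The shrinkage step is a generic
# stand-in for "borrowing strength", not the paper's method.
import numpy as np

rng = np.random.default_rng(0)
n, p = 2000, 50                # samples and dimension (illustrative)
sigma = 1.0                    # noise level (illustrative)

# Rank-1 signals X_i = z_i * u lying on a one-dimensional subspace
u = rng.standard_normal(p)
u /= np.linalg.norm(u)
z = rng.standard_normal(n)
X = np.outer(z, u)             # n x p matrix whose rows are the true X_i

# Coordinate-masking transforms A_i: keep ~60% of entries per observation
mask = rng.random((n, p)) < 0.6
Y = mask * X + sigma * rng.standard_normal((n, p)) * mask

# Naive per-observation regression: with a diagonal 0/1 mask A_i, the
# least-squares estimate of X_i equals the observed entries (zero elsewhere)
X_naive = Y.copy()

# Borrowing strength across samples: estimate the signal direction from the
# top right singular vector of Y, then shrink the projected scores
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
v1 = Vt[0]
scores = Y @ v1
noise_var = sigma ** 2 * mask.mean()                 # rough noise contribution
signal_var = max(np.var(scores) - noise_var, 0.0)    # rough signal variance
shrink = signal_var / max(np.var(scores), 1e-12)
X_shrunk = np.outer(shrink * scores, v1)

print("naive MSE:   ", np.mean((X_naive - X) ** 2))
print("shrunk MSE:  ", np.mean((X_shrunk - X) ** 2))
```

In the high-noise regime the naive estimate pays for the noise at observed entries and the full signal at missing ones, while the pooled estimate exploits the shared low-dimensional structure across all $n$ observations.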
