Minimax risks for sparse regressions: Ultra-high-dimensional phenomenons

Consider the standard linear regression model Y = Xβ + ε, where Y ∈ R^n is a response vector, X ∈ R^{n×p} is a design matrix, β ∈ R^p is the unknown regression vector, and ε is a Gaussian noise vector. Numerous works have been devoted to building efficient estimators of β when p is much larger than n. In such a situation, a classical approach amounts to assuming that β is approximately sparse. This paper studies the minimax risks of estimation and testing over classes of k-sparse vectors β. These bounds shed light on the limitations due to high dimensionality. The results encompass the problem of prediction (estimation of Xβ), the inverse problem (estimation of β), and linear testing (testing a linear hypothesis on β). Interestingly, an elbow effect occurs when the quantity k log(p/k) becomes larger than the sample size n. Indeed, the minimax risks and hypothesis separation distances blow up in this ultra-high-dimensional setting. We also prove that even dimension reduction techniques cannot provide satisfying results in an ultra-high-dimensional setting. Finally, the minimax risks are also studied when the noise variance σ² is unknown. The knowledge of σ² is shown to play a significant role in the optimal rates of estimation and testing.
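To make the setting concrete, here is a minimal numerical sketch (not from the paper) of the sparse linear model Y = Xβ + ε. It assumes a Gaussian design and uses an idealized "oracle" least-squares fit restricted to the true support of a k-sparse β; the k log(p/k)/n quantity that governs the high-dimensional/ultra-high-dimensional boundary is computed alongside for comparison. The variable names and the oracle benchmark are illustrative choices, not constructions from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k, sigma = 200, 1000, 5, 1.0   # n observations, p variables, k-sparse signal

# Gaussian design and k-sparse regression vector (illustrative assumption)
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 3.0
y = X @ beta + sigma * rng.standard_normal(n)

# Oracle least squares on the true support: an idealized benchmark that
# knows which k coordinates are active (unavailable in practice)
S = np.arange(k)
coef, *_ = np.linalg.lstsq(X[:, S], y, rcond=None)
beta_hat = np.zeros(p)
beta_hat[S] = coef

# Empirical prediction risk (1/n) * ||X(beta_hat - beta)||^2
pred_risk = np.mean((X @ (beta_hat - beta)) ** 2)

# The quantity k log(p/k) / n: when it is small the problem is
# high-dimensional but tractable; the paper's elbow occurs once
# k log(p/k) exceeds n (ultra-high-dimensional regime)
rate = sigma**2 * k * np.log(p / k) / n
print(f"oracle empirical prediction risk: {pred_risk:.4f}")
print(f"sigma^2 k log(p/k) / n benchmark: {rate:.4f}")
```

The oracle risk scales like σ²k/n; the extra log(p/k) factor in the minimax benchmark is the price of not knowing the support, and it is this quantity's comparison with n that separates the two regimes discussed in the abstract.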
View on arXiv