High-dimensional inference via hybrid orthogonalization

The past decade has witnessed a surge of work on statistical inference for high-dimensional sparse regression, particularly via de-biasing or relaxed orthogonalization. Nevertheless, these techniques typically require a more stringent sparsity condition than is needed for estimation consistency, which seriously limits their practical applicability. To alleviate this constraint, we propose to exploit the identifiable features to residualize the design matrix before performing debiasing-based inference on the parameters of interest. This leads to a hybrid orthogonalization (HOT) technique that performs strict orthogonalization against the identifiable features but relaxed orthogonalization against the others. Under an approximately sparse model with a mixture of identifiable and unidentifiable signals, we establish the asymptotic normality of the HOT test statistic while accommodating as many identifiable signals as consistent estimation allows. The efficacy of the proposed test is also demonstrated through simulations and the analysis of a stock market dataset.
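To make the two-stage idea concrete, the following is a minimal, hypothetical sketch in Python of a HOT-style statistic for testing a single coefficient: exact least-squares projection against the identifiable set (strict orthogonalization) followed by a node-wise Lasso and a one-step de-biasing correction (relaxed orthogonalization). The function name, tuning parameters, and the exact standardization are illustrative assumptions built on the standard de-biased Lasso recipe, not the paper's actual construction.

```python
import numpy as np
from sklearn.linear_model import Lasso

def hot_test_statistic(X, y, j, S, lam_node=0.1, lam_init=0.1):
    """Sketch of a HOT-style test statistic for H0: beta_j = 0.

    X: (n, p) design; y: (n,) response; j: index of the coefficient
    under test; S: indices of the identifiable features (e.g. selected
    by a preliminary Lasso), assumed not to contain j. Tuning parameters
    and the plug-in noise estimate are illustrative choices.
    """
    n, p = X.shape
    S = [k for k in S if k != j]
    others = [k for k in range(p) if k != j and k not in set(S)]

    # Strict orthogonalization: project out the identifiable features
    # exactly, via a thin QR decomposition of X[:, S].
    if S:
        Qs, _ = np.linalg.qr(X[:, S])
        proj = lambda v: v - Qs @ (Qs.T @ v)
    else:
        proj = lambda v: v
    xj = proj(X[:, j])
    Z = proj(X[:, others])
    yr = proj(y)

    # Relaxed orthogonalization: node-wise Lasso of the residualized x_j
    # on the remaining features, as in de-biased (de-sparsified) inference.
    node = Lasso(alpha=lam_node, fit_intercept=False).fit(Z, xj)
    z = xj - Z @ node.coef_

    # Initial Lasso fit on the residualized data, then a one-step
    # de-biasing correction for coordinate j.
    W = np.column_stack([xj, Z])
    init = Lasso(alpha=lam_init, fit_intercept=False).fit(W, yr)
    resid = yr - W @ init.coef_
    beta_j = init.coef_[0] + (z @ resid) / (z @ xj)

    # Standardize with a plug-in noise estimate; under suitable sparsity
    # conditions the statistic is asymptotically N(0, 1) under H0.
    sigma = np.sqrt(resid @ resid / max(n - len(S) - 1, 1))
    se = sigma * np.linalg.norm(z) / abs(z @ xj)
    return beta_j / se
```

In practice one might, for instance, form S from the large coefficients of a cross-validated Lasso fit on held-out data and compare the statistic to standard normal quantiles; any such selection rule is an assumption here rather than the paper's prescription.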