M-estimation with the Trimmed $\ell_1$ Penalty

Abstract

We study high-dimensional estimators with the trimmed $\ell_1$ penalty, which leaves the $h$ largest parameter entries penalty-free. While optimization techniques for this nonconvex penalty have been studied, its statistical properties have not yet been analyzed. We present the first statistical analyses for $M$-estimation and characterize support recovery and the $\ell_\infty$ and $\ell_2$ error of the trimmed $\ell_1$ estimates as a function of the trimming parameter $h$. Our results show different regimes depending on how $h$ compares to the true support size. Our second contribution is a new algorithm for the trimmed regularization problem, which has the same theoretical convergence rate as difference-of-convex (DC) algorithms but in practice is faster and finds lower objective values. Empirical evaluation of $\ell_1$ trimming for sparse linear regression and graphical model estimation indicates that trimmed $\ell_1$ can outperform vanilla $\ell_1$ and nonconvex alternatives. Our last contribution is to show that the trimmed penalty is beneficial beyond $M$-estimation, yielding promising results for two deep learning tasks: input structure recovery and network sparsification.
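To make the penalty concrete: writing $|\theta|_{(1)} \ge \dots \ge |\theta|_{(p)}$ for the sorted magnitudes of $\theta \in \mathbb{R}^p$, the trimmed $\ell_1$ penalty sums all but the $h$ largest entries, $R_h(\theta) = \sum_{i = h+1}^{p} |\theta|_{(i)}$, and reduces to the vanilla $\ell_1$ norm at $h = 0$. Below is a minimal NumPy sketch of this quantity; it is an illustration of the definition above, not the authors' implementation, and the function name `trimmed_l1` is ours.

    import numpy as np

    def trimmed_l1(theta, h):
        # Trimmed l1 penalty: sum of all but the h largest |theta_i|.
        # Equals the vanilla l1 norm when h = 0. (Illustrative sketch,
        # not the paper's code.)
        mags = np.sort(np.abs(np.asarray(theta)))   # magnitudes, ascending
        return mags[: max(len(mags) - h, 0)].sum()  # drop the h largest

    # Example: for theta = (3, -2, 0.5) and h = 1, the largest entry (3)
    # is penalty-free, so the penalty is |-2| + |0.5| = 2.5.
    print(trimmed_l1([3.0, -2.0, 0.5], h=1))  # 2.5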
