
Lockout: Sparse Regularization of Neural Networks

Abstract

Many regression and classification procedures fit a parameterized function $f(x;w)$ of predictor variables $x$ to data $\{x_i, y_i\}_1^N$ based on some loss criterion $L(y,f)$. Often, regularization is applied to improve accuracy by placing a constraint $P(w)\leq t$ on the values of the parameters $w$. Although efficient methods exist for finding solutions to these constrained optimization problems for all values of $t\geq 0$ in the special case when $f$ is a linear function, none are available when $f$ is non-linear (e.g., Neural Networks). Here we present a fast algorithm that provides all such solutions for any differentiable function $f$ and loss $L$, and any constraint $P$ that is an increasing monotone function of the absolute value of each parameter. Applications involving sparsity-inducing regularization of arbitrary Neural Networks are discussed. Empirical results indicate that these sparse solutions are usually superior to their dense counterparts in both accuracy and interpretability. This improvement in accuracy can often make Neural Networks competitive with, and sometimes superior to, state-of-the-art methods in the analysis of tabular data.

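To make the constrained formulation concrete, the sketch below sets up the problem the abstract describes, minimizing a loss $L(y, f(x;w))$ subject to $P(w)\leq t$, for one particular choice of $P$ (the L1 norm of the first-layer weights) and solves it with projected gradient descent. This is only an illustrative assumption, not the paper's Lockout algorithm or its path-following procedure; the toy data, network, constraint level `t`, and the `project_l1_ball` helper are all hypothetical.

```python
# Illustrative sketch (not the Lockout algorithm): fit a small network under
# the constraint P(w) <= t with P(w) = sum_j |w_j| on the first-layer weights,
# using projected gradient descent. Sweeping t >= 0 would trace a solution path.
import torch
import torch.nn as nn


def project_l1_ball(w, t):
    """Euclidean projection of a weight tensor onto the L1 ball {v : sum|v_i| <= t}."""
    if w.abs().sum() <= t:
        return w
    u, _ = torch.sort(w.abs().flatten(), descending=True)
    css = torch.cumsum(u, dim=0)
    k = torch.arange(1, u.numel() + 1, device=w.device, dtype=w.dtype)
    rho = torch.nonzero(u - (css - t) / k > 0).max()
    theta = (css[rho] - t) / (rho + 1)
    return torch.sign(w) * torch.clamp(w.abs() - theta, min=0.0)


# Toy regression data: y depends on only two of the ten predictors.
torch.manual_seed(0)
X = torch.randn(200, 10)
y = X[:, 0] - 2.0 * X[:, 3] + 0.1 * torch.randn(200)

model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), lr=0.05)

t = 3.0  # constraint level t; smaller values force sparser first-layer weights
for step in range(500):
    opt.zero_grad()
    loss = loss_fn(model(X).squeeze(-1), y)
    loss.backward()
    opt.step()
    # Project the first-layer weights back onto the feasible set P(w) <= t.
    with torch.no_grad():
        W = model[0].weight
        W.copy_(project_l1_ball(W, t))

print("final loss:", loss.item(), "|w|_1:", model[0].weight.abs().sum().item())
```

Under this setup, shrinking `t` toward zero drives more first-layer weights exactly to zero, which is the kind of sparsity-inducing, interpretable behavior the abstract attributes to the constrained solutions.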