We obtain bounds on estimation error rates for regularization procedures of the form \begin{equation*} \hat f \in {\rm argmin}_{f\in F}\left(\frac{1}{N}\sum_{i=1}^N\left(Y_i-f(X_i)\right)^2+\lambda \Psi(f)\right) \end{equation*} when is a norm and is convex. Our approach gives a common framework that may be used in the analysis of learning problems and regularization problems alike. In particular, it sheds some light on the role various notions of sparsity have in regularization and on their connection with the size of subdifferentials of in a neighbourhood of the true minimizer. As `proof of concept' we extend the known estimates for the LASSO, SLOPE and trace norm regularization.
View on arXiv