
Tight bounds for maximum $\ell_1$-margin classifiers

Abstract

Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. sparse hard-margin SVM, in high-dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard sparse ground truths. We show that, surprisingly, this adaptivity does not apply to the maximum $\ell_1$-margin classifier in a standard discriminative setting. In particular, in the noiseless setting, we prove tight upper and lower bounds for the prediction error that match existing rates of order $\frac{\|w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order $\frac{1}{\sqrt{\log(d/n)}}$. We are therefore the first to show benign overfitting for the maximum $\ell_1$-margin classifier.
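For context (not part of the abstract): on linearly separable data, the maximum $\ell_1$-margin classifier can be computed directly as a linear program, namely minimize $\|w\|_1$ subject to $y_i \langle x_i, w \rangle \ge 1$, then rescale the minimizer to the unit $\ell_1$-ball. Below is a minimal sketch assuming this standard formulation and labels $y_i \in \{-1, +1\}$; the function name and SciPy-based setup are illustrative and not taken from the paper.

import numpy as np
from scipy.optimize import linprog

def max_l1_margin(X, y):
    """Sparse hard-margin SVM: minimize ||w||_1 s.t. y_i <x_i, w> >= 1,
    then rescale so the returned vector lies on the unit l1-sphere."""
    n, d = X.shape
    # Split w into nonnegative parts: w = u - v with u, v >= 0,
    # so that ||w||_1 = sum(u) + sum(v) becomes a linear objective.
    c = np.ones(2 * d)
    # Margin constraints y_i x_i^T (u - v) >= 1, written as A_ub z <= b_ub.
    A_ub = -y[:, None] * np.hstack([X, -X])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (2 * d))
    if not res.success:
        raise ValueError("LP infeasible: data may not be linearly separable")
    w = res.x[:d] - res.x[d:]
    return w / np.linalg.norm(w, 1)

The resulting direction attains margin $1/\|\hat{w}\|_1$, which is the largest $\ell_1$-margin achievable on the given sample.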
