On optimality of empirical risk minimization in linear aggregation

In the first part of this paper, we show that the small-ball condition, recently introduced by Mendelson (2015), may behave poorly for important classes of localized functions, such as wavelets, piecewise polynomials, or trigonometric polynomials, leading in particular to suboptimal estimates of the rate of convergence of ERM for the linear aggregation problem. In the second part, we recover optimal rates of convergence for the excess risk of ERM when the dictionary is made of trigonometric functions. In the bounded case, we derive the concentration of the excess risk around a single point, which is far more precise information than the rate of convergence alone. Finally, in the general setting of an L2 noise, we refine the small-ball argument by suitably selecting the directions under consideration, so as to obtain optimal rates of aggregation for the Fourier dictionary.
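For reference, the small-ball condition of Mendelson (2015) is typically stated as follows (a standard formulation sketched here; the constants kappa and epsilon are part of the assumption, not taken from this abstract):

```latex
% Small-ball condition for a function class F:
% there exist constants $\kappa > 0$ and $\varepsilon \in (0,1)$ such that
\mathbb{P}\bigl( |f(X)| \ge \kappa \,\|f\|_{L_2} \bigr) \ge \varepsilon
\qquad \text{for every } f \in F .
```

Informally, every function in the class puts a fixed fraction of probability mass away from zero at the scale of its L2 norm; the abstract's point is that this fraction can degenerate for localized dictionaries.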