
Empirical risk minimization is optimal for the convex aggregation problem

Abstract

Let $F$ be a finite model of cardinality $M$ and denote by $\operatorname{conv}(F)$ its convex hull. The problem of convex aggregation is to construct a procedure having a risk as close as possible to the minimal risk over $\operatorname{conv}(F)$. Consider the bounded regression model with respect to the squared risk denoted by $R(\cdot)$. If ${\widehat{f}}_n^{\mathit{ERM-C}}$ denotes the empirical risk minimization procedure over $\operatorname{conv}(F)$, then we prove that for any $x>0$, with probability greater than $1-4\exp(-x)$, \[R({\widehat{f}}_n^{\mathit{ERM-C}})\leq\min_{f\in \operatorname {conv}(F)}R(f)+c_0\max \biggl(\psi_n^{(C)}(M),\frac{x}{n}\biggr),\] where $c_0>0$ is an absolute constant and $\psi_n^{(C)}(M)$ is the optimal rate of convex aggregation defined in (In Computational Learning Theory and Kernel Machines (COLT-2003) (2003) 303-313 Springer) by $\psi_n^{(C)}(M)=M/n$ when $M\leq \sqrt{n}$ and $\psi_n^{(C)}(M)=\sqrt{\log (\mathrm{e}M/\sqrt{n})/n}$ when $M>\sqrt{n}$.
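To make the two regimes of the rate concrete, the following minimal Python sketch evaluates $\psi_n^{(C)}(M)$ and the residual term $c_0\max(\psi_n^{(C)}(M), x/n)$ from the bound above. The function names are hypothetical, and because the abstract does not give a value for the absolute constant $c_0$, the default `c0=1.0` is only a placeholder assumption.

```python
import math


def convex_aggregation_rate(M: int, n: int) -> float:
    """Rate psi_n^(C)(M) as defined in the abstract.

    M is the cardinality of the finite model F and n is the sample size.
    """
    if M <= 0 or n <= 0:
        raise ValueError("M and n must be positive")
    root_n = math.sqrt(n)
    if M <= root_n:
        # Regime M <= sqrt(n): the rate is M / n.
        return M / n
    # Regime M > sqrt(n): the rate is sqrt(log(e * M / sqrt(n)) / n).
    return math.sqrt(math.log(math.e * M / root_n) / n)


def residual_term(M: int, n: int, x: float, c0: float = 1.0) -> float:
    """Residual c0 * max(psi_n^(C)(M), x / n) from the oracle inequality.

    c0 is an unspecified absolute constant in the abstract; 1.0 is a
    placeholder assumption, not a value taken from the paper.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    return c0 * max(convex_aggregation_rate(M, n), x / n)


def confidence_level(x: float) -> float:
    """Probability lower bound 1 - 4 * exp(-x) attached to the inequality."""
    return 1.0 - 4.0 * math.exp(-x)
```

For instance, with $n=100$ the boundary case $M=10=\sqrt{n}$ falls in the first regime and gives $M/n=0.1$, while $M=100$ falls in the second regime. Note that the probability bound $1-4\exp(-x)$ is informative only when $x>\log 4$.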
