
Smoothness, Low-Noise and Fast Rates

Abstract

We establish an excess risk bound of order $H\mathcal{R}_n^2 + \sqrt{H L^*}\,\mathcal{R}_n$ for ERM with an $H$-smooth loss function and a hypothesis class with Rademacher complexity $\mathcal{R}_n$, where $L^*$ is the best risk achievable by the hypothesis class. For typical hypothesis classes where $\mathcal{R}_n = \sqrt{R/n}$, this translates to a learning rate of order $RH/n$ in the separable ($L^* = 0$) case and $RH/n + \sqrt{L^* RH/n}$ more generally. We also provide similar guarantees for online and stochastic convex optimization of a smooth non-negative objective.
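For instance, the stated learning rates follow by substituting the typical complexity $\mathcal{R}_n = \sqrt{R/n}$ into the general bound (a direct calculation using only the quantities defined above):

$$H\mathcal{R}_n^2 + \sqrt{H L^*}\,\mathcal{R}_n \;=\; H \cdot \frac{R}{n} + \sqrt{H L^*} \cdot \sqrt{\frac{R}{n}} \;=\; \frac{RH}{n} + \sqrt{\frac{L^* R H}{n}},$$

which reduces to $RH/n$ in the separable case $L^* = 0$.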
