The Rate of Convergence of AdaBoost

29 June 2011

Abstract

The AdaBoost algorithm was designed to combine many "weak" hypotheses that perform slightly better than random guessing into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the "exponential loss." Unlike previous work, our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Our first result shows that at iteration $t$ , the exponential loss of AdaBoost's computed parameter vector will be at most $\epsilon$ more than that of any parameter vector of $\ell_1$ -norm bounded by $B$ in a number of rounds that is at most a polynomial in $B$ and $1/\epsilon$ . We also provide lower bounds showing that a polynomial dependence on these parameters is necessary. Our second result is that within $C/\epsilon$ iterations, AdaBoost achieves a value of the exponential loss that is at most $\epsilon$ more than the best possible value, where $C$ depends on the dataset. We show that this dependence of the rate on $\epsilon$ is optimal up to constant factors, i.e., at least $\Omega(1/\epsilon)$ rounds are necessary to achieve within $\epsilon$ of the optimal exponential loss.

View on arXiv

Comments on this paper