The AdaBoost algorithm was designed to combine many "weak" hypotheses that perform slightly better than random guessing into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the "exponential loss." Unlike previous work, our proofs do not require a weak-learning assumption, nor do they require that minimizers of the exponential loss are finite. Our first result shows that at iteration $t$, the exponential loss of AdaBoost's computed parameter vector will be at most $\epsilon$ more than that of any parameter vector of $\ell_1$-norm bounded by $B$ in a number of rounds that is at most a polynomial in $B$ and $1/\epsilon$. We also provide lower bounds showing that a polynomial dependence on these parameters is necessary. Our second result is that within $C/\epsilon$ iterations, AdaBoost achieves a value of the exponential loss that is at most $\epsilon$ more than the best possible value, where $C$ depends on the dataset. We show that this dependence of the rate on $\epsilon$ is optimal up to constant factors, i.e., at least $\Omega(1/\epsilon)$ rounds are necessary to achieve within $\epsilon$ of the optimal exponential loss.
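To make the object of study concrete, the following is a minimal sketch of AdaBoost viewed as greedy coordinate descent on the exponential loss, which is the standard formulation but is not spelled out in the abstract. The names `exp_loss`, `adaboost`, and `M_example`, and the toy matrix itself, are illustrative assumptions. Here `M[i][j]` is assumed to equal $y_i h_j(x_i) \in \{-1, +1\}$ for example $i$ and weak hypothesis $j$, and the loss is taken as the average of $\exp(-(M\lambda)_i)$ over the examples.

```python
import math


def exp_loss(M, lam):
    """Average exponential loss: (1/m) * sum_i exp(-(M lam)_i)."""
    total = 0.0
    for row in M:
        margin = sum(m_ij * l_j for m_ij, l_j in zip(row, lam))
        total += math.exp(-margin)
    return total / len(M)


def adaboost(M, rounds):
    """Run AdaBoost as coordinate descent on the exponential loss.

    M[i][j] = y_i * h_j(x_i), assumed to lie in {-1, +1}.
    Returns (lam, losses), where losses[t] is the loss after t rounds.
    """
    m, n = len(M), len(M[0])
    lam = [0.0] * n
    losses = [exp_loss(M, lam)]
    for _ in range(rounds):
        # Distribution over examples, proportional to exp(-margin).
        w = [math.exp(-sum(M[i][j] * lam[j] for j in range(n))) for i in range(m)]
        z = sum(w)
        d = [w_i / z for w_i in w]
        # Edge of each weak hypothesis under the current distribution.
        edges = [sum(d[i] * M[i][j] for i in range(m)) for j in range(n)]
        j = max(range(n), key=lambda k: abs(edges[k]))
        r = edges[j]
        if abs(r) >= 1.0:
            # A hypothesis that is perfect on the weighted sample would need
            # an infinite step; this sketch simply stops instead.
            break
        # Exact line search along coordinate j (the usual AdaBoost step size).
        lam[j] += 0.5 * math.log((1.0 + r) / (1.0 - r))
        losses.append(exp_loss(M, lam))
    return lam, losses


# Hypothetical toy problem: 4 examples, 3 weak hypotheses.
M_example = [
    [+1, +1, -1],
    [+1, -1, +1],
    [-1, +1, +1],
    [+1, +1, +1],
]
lam, losses = adaboost(M_example, rounds=20)
print("initial loss:", losses[0])
print("final loss:  ", losses[-1])
```

With an exact line search, each round multiplies the loss by $\sqrt{1 - r^2}$, where $r$ is the selected edge, so the recorded losses never increase. The abstract's results concern how quickly this sequence approaches its infimum, which the sketch only lets one observe empirically on a toy instance.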