We study the cost of parallelizing weak-to-strong boosting algorithms for learning, following the recent work of Karbasi and Larsen. Our main results are two-fold:

- First, we prove a tight lower bound, showing that even "slight" parallelization of boosting requires an exponential blow-up in the complexity of training. Specifically, let $\gamma$ be the weak learner's advantage over random guessing. The famous \textsc{AdaBoost} algorithm produces an accurate hypothesis by interacting with the weak learner for $\tilde{O}(1/\gamma^2)$ rounds, where each round runs in polynomial time. Karbasi and Larsen showed that "significant" parallelization must incur exponential blow-up: any boosting algorithm either interacts with the weak learner for $\Omega(1/\gamma)$ rounds or incurs an $\exp(d/\gamma)$ blow-up in the complexity of training, where $d$ is the VC dimension of the hypothesis class. We close the gap by showing that any boosting algorithm either has $\tilde{\Omega}(1/\gamma^2)$ rounds of interaction or incurs a smaller exponential blow-up of $\exp(d)$.

- Complementing our lower bound, we show that there exists a boosting algorithm that, for a parameter $t$, uses $\tilde{O}(1/(t\gamma^2))$ rounds and suffers a blow-up of only $\exp(d \cdot t^2)$. Plugging in $t = \omega(1)$, this shows that the smaller blow-up in our lower bound is tight. More interestingly, this provides the first trade-off between parallelism and the total work required for boosting.
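
For intuition, below is a minimal sketch (not from the paper) of the sequential round structure of \textsc{AdaBoost} that parallel boosting seeks to compress: each of the $T = O(\log(1/\varepsilon)/\gamma^2)$ rounds must wait for the previous round's reweighting before it can query the weak learner again. The `weak_learner` interface and all identifiers are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def adaboost(X, y, weak_learner, gamma, eps):
    """Sequential AdaBoost sketch: T = O(log(1/eps) / gamma^2) rounds,
    each round querying the weak learner on the current reweighting."""
    n = len(y)
    D = np.full(n, 1.0 / n)                      # distribution over the n examples
    T = int(np.ceil(np.log(1.0 / eps) / (2 * gamma ** 2)))  # standard round bound
    hypotheses, alphas = [], []
    for _ in range(T):                           # rounds are inherently sequential:
        h = weak_learner(X, y, D)                # round r needs the weights from round r-1
        pred = h(X)                              # +/-1 predictions on the training set
        err = np.dot(D, (pred != y).astype(float))  # weighted error, at most 1/2 - gamma
        alpha = 0.5 * np.log((1 - err) / err)    # weight of this weak hypothesis
        D *= np.exp(-alpha * y * pred)           # up-weight mistakes, down-weight correct
        D /= D.sum()                             # renormalize to a distribution
        hypotheses.append(h)
        alphas.append(alpha)
    # strong hypothesis: sign of the weighted majority vote
    return lambda Xq: np.sign(sum(a * h(Xq) for a, h in zip(alphas, hypotheses)))
```

The question studied in the paper is what it costs to collapse these $T$ sequential weak-learner calls into fewer rounds of (batched) interaction.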