
Gradient and Newton Boosting for Classification and Regression

Abstract

Boosting algorithms show high predictive accuracy on a wide array of datasets. To date, the distinction between boosting with either gradient descent or second-order updates is often not made, and it is thus implicitly assumed that the difference is irrelevant. In this article, we present gradient and Newton boosting, as well as a hybrid variant of the two, in a unified framework. We compare these boosting algorithms with trees as base learners on a large set of regression and classification datasets using various choices of loss functions. Our experiments show that Newton boosting outperforms gradient and hybrid gradient-Newton boosting in terms of predictive accuracy on the majority of datasets. Further, we present empirical evidence that this difference in predictive accuracy is not primarily due to faster convergence of Newton boosting, but rather because Newton boosting often achieves lower test errors while at the same time attaining lower training losses. In addition, we introduce a novel tuning parameter for tree-based Newton boosting which is interpretable and important for predictive accuracy.
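To illustrate the distinction the abstract draws, the following is a minimal sketch (not the paper's implementation) of a single boosting iteration for the logistic loss, contrasting a gradient update with a Newton (second-order) update. The helper names and the use of scikit-learn regression trees as base learners are assumptions made for illustration only.

```python
# Minimal sketch of one boosting iteration: gradient vs. Newton update
# for the logistic loss. Hypothetical helpers; not the paper's code.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def logistic_grad_hess(y, f):
    """First and second derivatives of the logistic loss w.r.t. the score f."""
    p = 1.0 / (1.0 + np.exp(-f))   # predicted probability
    grad = p - y                   # dL/df
    hess = p * (1.0 - p)           # d^2L/df^2
    return grad, hess

def boosting_step(X, y, f, kind="newton", nu=0.1, max_depth=3):
    """One boosting iteration; `f` holds the current raw scores."""
    grad, hess = logistic_grad_hess(y, f)
    tree = DecisionTreeRegressor(max_depth=max_depth)
    if kind == "gradient":
        # Gradient boosting: fit the tree to the negative gradients
        # (pseudo-residuals), unweighted.
        tree.fit(X, -grad)
    else:
        # Newton boosting: fit the tree to -grad/hess with Hessian sample
        # weights, so each leaf value approximates the second-order step
        # -sum(grad) / sum(hess) over the samples in that leaf.
        tree.fit(X, -grad / hess, sample_weight=hess)
    return f + nu * tree.predict(X)
```

In this sketch, the gradient variant only uses first-order information, while the Newton variant weights observations by their Hessians; the hybrid variant discussed in the paper combines elements of both.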
