Gradient and Newton Boosting for Classification and Regression

Boosting algorithms achieve high predictive accuracy on a wide array of datasets. In this article, we present gradient and Newton boosting, as well as a hybrid variant of the two, in a unified framework. In addition, we introduce a novel tuning parameter for tree-based Newton boosting that is important for predictive accuracy. We compare the different boosting algorithms with trees as base learners on a wide range of datasets and various choices of loss functions, and we find that Newton boosting outperforms both gradient boosting and hybrid gradient-Newton boosting on the majority of datasets. Furthermore, we present empirical evidence that this difference in predictive accuracy is due not primarily to faster convergence of Newton boosting, but to the fact that Newton boosting often achieves a lower training loss while simultaneously attaining higher out-of-sample predictive accuracy.
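To make the distinction between the two update rules concrete, below is a minimal sketch of tree-based gradient versus Newton boosting for the logistic loss, using scikit-learn decision trees as base learners. The function name `boost_logistic` and all parameter defaults are illustrative assumptions, not the paper's implementation; in particular, the paper's novel tuning parameter and the hybrid gradient-Newton variant (gradient step for the tree structure, Newton step for the leaf values) are not shown.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost_logistic(X, y, n_iter=100, lr=0.1, newton=True, max_depth=3):
    """Toy gradient/Newton boosting for binary classification (y in {0, 1})
    with the logistic loss and regression trees as base learners.
    Illustrative sketch only, not the paper's implementation."""
    f = np.zeros(len(y))                  # raw scores F(x), initialized at 0
    trees = []
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-f))      # sigmoid of the current scores
        g = p - y                         # first derivative of the loss
        h = p * (1.0 - p)                 # second derivative (diagonal Hessian)
        tree = DecisionTreeRegressor(max_depth=max_depth)
        if newton:
            # Newton boosting: weighted least-squares fit of -g/h with
            # weights h; each squared-error leaf value then equals
            # -sum(g)/sum(h), i.e. a one-step Newton update per leaf.
            eps = 1e-12                   # guard against vanishing Hessians
            tree.fit(X, -g / (h + eps), sample_weight=h + eps)
        else:
            # Gradient boosting: fit the negative gradient directly;
            # leaf values are means of -g, with no Hessian information.
            tree.fit(X, -g)
        f += lr * tree.predict(X)         # shrunken additive update
        trees.append(tree)
    return trees, f

if __name__ == "__main__":
    # Compare training losses of the two variants on simulated data.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)
    for name, use_newton in [("newton", True), ("gradient", False)]:
        _, f = boost_logistic(X, y, newton=use_newton)
        p = np.clip(1.0 / (1.0 + np.exp(-f)), 1e-12, 1 - 1e-12)
        loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
        print(name, "training log loss:", loss)
```

Under these assumptions, the only difference between the two variants is how the tree is fit to the derivatives; the paper's comparison should additionally be read against held-out data, since a lower training loss alone does not establish better out-of-sample accuracy.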