To Stay or Not to Stay in the Pre-train Basin: Insights on Ensembling in Transfer Learning

Neural Information Processing Systems (NeurIPS), 2023

6 March 2023

Ildus Sadrtdinov

Dmitrii Pozdeev

Dmitry Vetrov

E. Lobacheva


Main: 10 Pages

23 Figures

Bibliography: 4 Pages

9 Tables

Appendix: 15 Pages

Abstract

Transfer learning and ensembling are two popular techniques for improving the performance and robustness of neural networks. Because pre-training is expensive, ensembles of models fine-tuned from a single pre-trained checkpoint are often used in practice. Such models end up in the same basin of the loss landscape and therefore have limited diversity. In this work, we study whether ensembles trained from a single pre-trained checkpoint can be improved by better exploring the pre-train basin or its close vicinity. We show that while exploring the pre-train basin may benefit the ensemble, leaving the basin forfeits the benefits of transfer learning and degrades ensemble quality.
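To make the ensembling setup concrete, the following is a minimal sketch of how predictions from several fine-tuned members might be combined for one input. It assumes the common convention of averaging per-member softmax probabilities, which the abstract does not specify; the function names and the logit values are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert one member's logits into class probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(member_logits):
    """Average class probabilities across ensemble members for one input.

    Assumption: members are combined by probability averaging.
    """
    probs = [softmax(logits) for logits in member_logits]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


# Hypothetical logits from three members fine-tuned from the same checkpoint.
member_logits = [
    [2.0, 0.5, 0.1],
    [1.8, 0.7, 0.2],
    [0.3, 2.1, 0.4],
]
avg = ensemble_predict(member_logits)
predicted_class = max(range(len(avg)), key=avg.__getitem__)
```

In this sketch the third member disagrees with the other two, which is the kind of diversity an ensemble relies on. Members that stay close together in the same basin would tend to produce more similar probabilities.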
