To Stay or Not to Stay in the Pre-train Basin: Insights on Ensembling in Transfer Learning
Neural Information Processing Systems (NeurIPS), 2023
Main: 10 pages, Bibliography: 4 pages, Appendix: 15 pages, 23 figures, 9 tables
Abstract
Transfer learning and ensembling are two popular techniques for improving the performance and robustness of neural networks. Because pre-training is expensive, ensembles of models fine-tuned from a single pre-trained checkpoint are often used in practice. Such models end up in the same basin of the loss landscape and thus have limited diversity. In this work, we study whether ensembles trained from a single pre-trained checkpoint can be improved by better exploring the pre-train basin or a close vicinity outside of it. We show that while exploring the pre-train basin may benefit the ensemble, leaving the basin forfeits the benefits of transfer learning and degrades ensemble quality.
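As a rough illustration of the setting the abstract describes, the sketch below averages the class probabilities of several fine-tuned members and computes a crude diversity proxy. It is a minimal sketch under stated assumptions, not the paper's method: the logits are made-up values, the function names `ensemble_predict` and `disagreement` are hypothetical, and pairwise argmax disagreement is an illustrative stand-in for whatever diversity measure the paper itself uses.

```python
import math

def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predict(member_logits):
    """Average the class probabilities of all members for a single input.

    member_logits: one list of logits per ensemble member, all the same length.
    """
    probs = [softmax(l) for l in member_logits]
    n_members = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_members for c in range(n_classes)]

def disagreement(member_logits):
    """Fraction of member pairs whose top-1 predictions differ.

    Assumption: used here only as a simple diversity proxy.
    """
    preds = [max(range(len(l)), key=l.__getitem__) for l in member_logits]
    pairs = [(i, j) for i in range(len(preds)) for j in range(i + 1, len(preds))]
    if not pairs:
        return 0.0
    return sum(preds[i] != preds[j] for i, j in pairs) / len(pairs)

# Hypothetical logits from three members fine-tuned from one checkpoint,
# evaluated on a single 3-class input.
logits = [[2.0, 0.5, 0.1], [1.8, 0.7, 0.2], [0.3, 1.9, 0.1]]
avg = ensemble_predict(logits)
diversity = disagreement(logits)
```

Members that land in the same basin tend to make similar predictions, so a proxy like `disagreement` stays low and averaging adds little; the trade-off studied in the paper is how to raise diversity without leaving the region where transfer learning still helps.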
