
Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization

Main: 8 pages · Appendix: 4 pages · Bibliography: 4 pages
6 figures · 9 tables
Abstract

Deep neural networks are increasingly used in safety-critical applications such as medical diagnosis and autonomous driving. However, many studies suggest that they are prone to poor calibration and overconfidence, which may have disastrous consequences. In this paper, we show that, unlike standard training methods such as stochastic gradient descent, the recently proposed sharpness-aware minimization (SAM) counteracts this tendency towards overconfidence. Our theoretical analysis suggests that SAM implicitly maximizes the entropy of the predictive distribution, and thus learns models that are already well calibrated. Inspired by this finding, we propose a variant of SAM, coined CSAM, to further improve model calibration. Extensive experiments on various datasets, including ImageNet-1K, demonstrate the benefits of SAM in reducing calibration error; CSAM performs even better, consistently achieving lower calibration error than competing approaches.
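For context, below is a minimal PyTorch sketch of the two-step SAM update (Foret et al., 2021) that the paper builds on: an ascent step to the sharpest nearby point within a rho-ball, followed by a descent step using the gradient taken there. The `sam_step` helper, the rho value, and the toy model in the usage snippet are illustrative assumptions, not the authors' implementation; CSAM's modification is not shown here.

```python
import torch

def sam_step(model, loss_fn, x, y, base_opt, rho=0.05):
    """One SAM update: w <- w - eta * grad L(w + eps), eps = rho * g / ||g||."""
    # First forward/backward: gradient g at the current weights w.
    loss_fn(model(x), y).backward()
    grads = [p.grad.clone() for p in model.parameters()]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads]))

    # Ascent step: perturb each weight by eps = rho * g / ||g||.
    with torch.no_grad():
        eps = []
        for p, g in zip(model.parameters(), grads):
            e = rho * g / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)

    # Second forward/backward: gradient at the perturbed weights w + eps.
    model.zero_grad()
    loss_fn(model(x), y).backward()

    # Restore the original weights, then descend with the perturbed gradient.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.sub_(e)
    base_opt.step()
    base_opt.zero_grad()

# Usage on a toy classifier (hypothetical setup for illustration only).
model = torch.nn.Linear(10, 3)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 10), torch.randint(0, 3, (32,))
sam_step(model, torch.nn.functional.cross_entropy, x, y, opt)
```

Note that SAM costs two forward/backward passes per update; the paper's claim is that this flatness-seeking step also acts as an implicit entropy regularizer on the predictive distribution, which is where the calibration benefit comes from.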

@article{tan2025_2505.23866,
  title={Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization},
  author={Chengli Tan and Yubo Zhou and Haishan Ye and Guang Dai and Junmin Liu and Zengjie Song and Jiangshe Zhang and Zixiang Zhao and Yunda Hao and Yong Xu},
  journal={arXiv preprint arXiv:2505.23866},
  year={2025}
}