
Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization

Main: 8 pages · Appendix: 4 pages · Bibliography: 4 pages
6 figures · 9 tables
Abstract

Deep neural networks are increasingly used in safety-critical applications such as medical diagnosis and autonomous driving. However, many studies suggest that they are prone to poor calibration and overconfidence, which may have disastrous consequences. In this paper, we show that, unlike standard training methods such as stochastic gradient descent, the recently proposed sharpness-aware minimization (SAM) counteracts this tendency towards overconfidence. Our theoretical analysis suggests that SAM implicitly maximizes the entropy of the predictive distribution, and thus learns models that are already well calibrated. Inspired by this finding, we propose a variant of SAM, coined CSAM, to further improve model calibration. Extensive experiments on various datasets, including ImageNet-1K, demonstrate the benefits of SAM in reducing calibration error; CSAM performs even better, consistently achieving lower calibration error than competing approaches.
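For context, below is a minimal PyTorch sketch of the two-step SAM update (Foret et al., 2021) that the paper builds on: an ascent step to the sharpest nearby point within a rho-ball, followed by a descent step using the gradient taken there. The `sam_step` helper, the rho value, and the toy model in the usage snippet are illustrative assumptions, not the authors' implementation; CSAM's modification is not shown here.

```python
import torch

def sam_step(model, loss_fn, x, y, base_opt, rho=0.05):
    """One SAM update: w <- w - eta * grad L(w + eps), eps = rho * g / ||g||."""
    # First forward/backward: gradient g at the current weights w.
    loss_fn(model(x), y).backward()
    grads = [p.grad.clone() for p in model.parameters()]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads]))

    # Ascent step: perturb each weight by eps = rho * g / ||g||.
    with torch.no_grad():
        eps = []
        for p, g in zip(model.parameters(), grads):
            e = rho * g / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)

    # Second forward/backward: gradient at the perturbed weights w + eps.
    model.zero_grad()
    loss_fn(model(x), y).backward()

    # Restore the original weights, then descend with the perturbed gradient.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.sub_(e)
    base_opt.step()
    base_opt.zero_grad()

# Usage on a toy classifier (hypothetical setup for illustration only).
model = torch.nn.Linear(10, 3)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 10), torch.randint(0, 3, (32,))
sam_step(model, torch.nn.functional.cross_entropy, x, y, opt)
```

Note that SAM costs two forward/backward passes per update; the paper's claim is that this flatness-seeking step also acts as an implicit entropy regularizer on the predictive distribution, which is where the calibration benefit comes from.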

@article{tan2025_2505.23866,
  title={Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization},
  author={Chengli Tan and Yubo Zhou and Haishan Ye and Guang Dai and Junmin Liu and Zengjie Song and Jiangshe Zhang and Zixiang Zhao and Yunda Hao and Yong Xu},
  journal={arXiv preprint arXiv:2505.23866},
  year={2025}
}