SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation

Abstract

Approximate second-order optimization methods often exhibit poorer generalization than first-order approaches. In this work, we examine this issue through the lens of the loss landscape and find that existing second-order methods tend to converge to sharper minima than SGD. In response, we propose Sassha, a novel second-order method designed to improve generalization by explicitly reducing the sharpness of the solution, while stabilizing the computation of approximate Hessians along the optimization trajectory. This sharpness-minimization scheme is also crafted to accommodate lazy Hessian updates, securing efficiency in addition to flatness. To validate its effectiveness, we conduct a wide range of standard deep learning experiments, in which Sassha demonstrates generalization performance comparable to, and mostly better than, other methods. We provide a comprehensive set of analyses covering convergence, robustness, stability, efficiency, and cost.
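The abstract combines three ingredients: a sharpness-reducing perturbation, a stabilized (absolute-valued) approximate Hessian used as a preconditioner, and lazy Hessian updates for efficiency. The PyTorch sketch below is a minimal, hypothetical illustration of how these pieces could fit together; it is not the authors' Sassha implementation. The SAM-style ascent step, the Hutchinson diagonal-Hessian estimator, the square-root preconditioning, and the `hessian_update_freq` interval are all assumptions introduced here for illustration.

```python
# Hypothetical sketch of a sharpness-aware, lazily preconditioned second-order step.
# Not the official Sassha algorithm; details are assumed for illustration only.
import torch


def hutchinson_diag_hessian(loss, params, n_samples=1):
    """Estimate diag(H) with Hutchinson's method using Rademacher probes."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    diag = [torch.zeros_like(p) for p in params]
    for _ in range(n_samples):
        zs = [torch.randint_like(p, 0, 2) * 2.0 - 1.0 for p in params]  # +/-1 probes
        hvps = torch.autograd.grad(grads, params, grad_outputs=zs, retain_graph=True)
        for d, z, hvp in zip(diag, zs, hvps):
            d.add_(z * hvp / n_samples)
    return diag


def sharpness_aware_second_order_step(params, loss_fn, state, lr=0.1, rho=0.05,
                                      eps=1e-8, step=0, hessian_update_freq=10):
    """One illustrative update: perturb weights toward higher loss (sharpness-aware),
    then descend with a lazily refreshed, absolute-valued diagonal Hessian preconditioner."""
    # 1) SAM-style ascent step to a worst-case point in the neighborhood.
    loss = loss_fn()
    grads = torch.autograd.grad(loss, params)
    grad_norm = torch.sqrt(sum((g ** 2).sum() for g in grads)) + 1e-12
    perturbs = [rho * g / grad_norm for g in grads]
    with torch.no_grad():
        for p, e in zip(params, perturbs):
            p.add_(e)

    # 2) Gradient (and, only every `hessian_update_freq` steps, the diagonal Hessian)
    #    evaluated at the perturbed point. Taking |diag(H)| keeps the preconditioner
    #    positive, a simple stand-in for a "stable Hessian approximation".
    perturbed_loss = loss_fn()
    if step % hessian_update_freq == 0 or "diag_h" not in state:
        state["diag_h"] = [h.abs() for h in
                           hutchinson_diag_hessian(perturbed_loss, params)]
    grads = torch.autograd.grad(perturbed_loss, params)

    # 3) Undo the perturbation and apply the preconditioned descent step.
    with torch.no_grad():
        for p, e, g, h in zip(params, perturbs, grads, state["diag_h"]):
            p.sub_(e)
            p.sub_(lr * g / (h.sqrt() + eps))
    return loss.item()
```

In this sketch, reusing `state["diag_h"]` between refreshes is what makes the Hessian update "lazy": the expensive Hessian-vector products are amortized over several iterations while every step still descends from the sharpness-aware perturbed point.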

@article{shin2025_2502.18153,
  title={SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation},
  author={Dahun Shin and Dongyeop Lee and Jinseok Chung and Namhoon Lee},
  journal={arXiv preprint arXiv:2502.18153},
  year={2025}
}