Sharpness-Aware Minimization with Z-Score Gradient Filtering for Neural Networks

5 May 2025

Juyoung Yun


Main: 10 Pages

5 Figures

Bibliography: 3 Pages

3 Tables

Abstract

Generalizing well in deep neural networks remains a core challenge, particularly due to their tendency to converge to sharp minima that degrade robustness. Sharpness-Aware Minimization (SAM) mitigates this by seeking flatter minima but perturbs parameters using the full gradient, which can include statistically insignificant directions. We propose ZSharp, a simple yet effective extension to SAM that applies layer-wise Z-score normalization followed by percentile-based filtering to retain only statistically significant gradient components. This selective perturbation aligns updates with curvature-sensitive directions, enhancing generalization without requiring architectural changes. ZSharp introduces only one additional hyperparameter, the percentile threshold, and remains fully compatible with existing SAM variants. Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet using ResNet, VGG, and Vision Transformers show that ZSharp consistently outperforms SAM and its variants in test accuracy, particularly on deeper and transformer-based models. These results demonstrate that ZSharp is a principled and lightweight improvement for sharpness-aware optimization.
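The filtering step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's implementation. The abstract does not specify the details, so the following are assumptions: the Z-score is computed per layer over all entries of that layer's gradient, significance is judged by the absolute Z-score, entries below the percentile threshold are zeroed while surviving entries keep their raw gradient values, and the filtered gradient is then scaled to the usual SAM radius `rho`. The names `zscore_filter` and `sam_perturbation` are hypothetical.

```python
import numpy as np


def zscore_filter(grad, percentile=70.0, eps=1e-12):
    """Hypothetical helper: zero out entries of one layer's gradient whose
    absolute Z-score falls below the given percentile (assumed rule)."""
    # Layer-wise Z-score normalization of the gradient entries.
    z = (grad - grad.mean()) / (grad.std() + eps)
    # Percentile threshold on |z|; this is the single extra hyperparameter.
    threshold = np.percentile(np.abs(z), percentile)
    mask = np.abs(z) >= threshold
    # Assumption: kept entries retain their original gradient values.
    return np.where(mask, grad, 0.0)


def sam_perturbation(layer_grads, rho=0.05, percentile=70.0, eps=1e-12):
    """Build a SAM-style perturbation, rho * g / ||g||, from the filtered
    per-layer gradients instead of the full gradient."""
    filtered = [zscore_filter(g, percentile) for g in layer_grads]
    # Global L2 norm across all layers, as in standard SAM.
    norm = np.sqrt(sum(float(np.sum(f ** 2)) for f in filtered))
    return [rho * f / (norm + eps) for f in filtered]


# Toy example with two "layers" of gradients.
grads = [
    np.array([0.1, -0.2, 3.0, 0.05]),
    np.array([[1.0, -1.0], [0.0, 5.0]]),
]
perturbation = sam_perturbation(grads, rho=0.05, percentile=75.0)
```

In a full SAM step, the perturbation would be added to the weights, the gradient recomputed at the perturbed point, and that second gradient used for the actual update; only the construction of the perturbation changes in this sketch.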


@article{yun2025_2505.02369,
  title={Sharpness-Aware Minimization with Z-Score Gradient Filtering for Neural Networks},
  author={Juyoung Yun},
  journal={arXiv preprint arXiv:2505.02369},
  year={2025}
}
