ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.02369
  4. Cited By
Sharpness-Aware Minimization with Z-Score Gradient Filtering for Neural Networks
v1v2v3 (latest)

Sharpness-Aware Minimization with Z-Score Gradient Filtering for Neural Networks

5 May 2025
Juyoung Yun
ArXiv (abs)PDFHTML

Papers citing "Sharpness-Aware Minimization with Z-Score Gradient Filtering for Neural Networks"

19 / 19 papers shown
Title
Mitigating Gradient Overlap in Deep Residual Networks with Gradient
  Normalization for Improved Non-Convex Optimization
Mitigating Gradient Overlap in Deep Residual Networks with Gradient Normalization for Improved Non-Convex Optimization
Juyoung Yun
44
3
0
28 Oct 2024
Friendly Sharpness-Aware Minimization
Friendly Sharpness-Aware Minimization
Tao Li
Pan Zhou
Zhengbao He
Xinwen Cheng
Xiaolin Huang
AAML
84
17
0
19 Mar 2024
Robust Neural Pruning with Gradient Sampling Optimization for Residual
  Neural Networks
Robust Neural Pruning with Gradient Sampling Optimization for Residual Neural Networks
Juyoung Yun
61
1
0
26 Dec 2023
GA-SAM: Gradient-Strength based Adaptive Sharpness-Aware Minimization
  for Improved Generalization
GA-SAM: Gradient-Strength based Adaptive Sharpness-Aware Minimization for Improved Generalization
Zhiyuan Zhang
Ruixuan Luo
Qi Su
Xueting Sun
105
13
0
13 Oct 2022
When Vision Transformers Outperform ResNets without Pre-training or
  Strong Data Augmentations
When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations
Xiangning Chen
Cho-Jui Hsieh
Boqing Gong
ViT
112
330
0
03 Jun 2021
ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning
  of Deep Neural Networks
ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks
Jungmin Kwon
Jeongseop Kim
Hyunseong Park
I. Choi
128
291
0
23 Feb 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at
  Scale
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
780
41,945
0
22 Oct 2020
Sharpness-Aware Minimization for Efficiently Improving Generalization
Sharpness-Aware Minimization for Efficiently Improving Generalization
Pierre Foret
Ariel Kleiner
H. Mobahi
Behnam Neyshabur
AAML
251
1,363
0
03 Oct 2020
Gradient Centralization: A New Optimization Technique for Deep Neural
  Networks
Gradient Centralization: A New Optimization Technique for Deep Neural Networks
Hongwei Yong
Jianqiang Huang
Xiansheng Hua
Lei Zhang
ODL
100
188
0
03 Apr 2020
Benign Overfitting in Linear Regression
Benign Overfitting in Linear Regression
Peter L. Bartlett
Philip M. Long
Gábor Lugosi
Alexander Tsigler
MLT
135
780
0
26 Jun 2019
Exploring Generalization in Deep Learning
Exploring Generalization in Deep Learning
Behnam Neyshabur
Srinadh Bhojanapalli
David A. McAllester
Nathan Srebro
FAtt
205
1,261
0
27 Jun 2017
Attention Is All You Need
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
986
133,586
0
12 Jun 2017
Understanding deep learning requires rethinking generalization
Understanding deep learning requires rethinking generalization
Chiyuan Zhang
Samy Bengio
Moritz Hardt
Benjamin Recht
Oriol Vinyals
HAI
374
4,641
0
10 Nov 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp
  Minima
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
568
2,948
0
15 Sep 2016
Layer Normalization
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
450
10,570
0
21 Jul 2016
Deep Residual Learning for Image Recognition
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
2.7K
195,297
0
10 Dec 2015
Batch Normalization: Accelerating Deep Network Training by Reducing
  Internal Covariate Shift
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
802
43,419
0
11 Feb 2015
Very Deep Convolutional Networks for Large-Scale Image Recognition
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAttMDE
2.0K
100,832
0
04 Sep 2014
On the difficulty of training Recurrent Neural Networks
On the difficulty of training Recurrent Neural Networks
Razvan Pascanu
Tomas Mikolov
Yoshua Bengio
ODL
310
5,370
0
21 Nov 2012
1