ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks

23 February 2021
Jungmin Kwon, Jeongseop Kim, Hyunseong Park, In Kwon Choi

Papers citing "ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks"

43 / 193 papers shown

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models
Qihuang Zhong, Liang Ding, Li Shen, Peng Mi, Juhua Liu, Bo Du, Dacheng Tao
AAML · 11 Oct 2022

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, Dacheng Tao
AAML · 11 Oct 2022

SAM as an Optimal Relaxation of Bayes
Thomas Möllenhoff, Mohammad Emtiyaz Khan
BDL · 04 Oct 2022

Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
Sungyub Kim, Si-hun Park, Kyungsu Kim, Eunho Yang
BDL · 30 Sep 2022

Relaxed Attention for Transformer Models
Timo Lohrenz, Björn Möller, Zhengyang Li, Tim Fingscheidt
KELM · 20 Sep 2022

Bootstrap Generalization Ability from Loss Landscape Perspective
Huanran Chen, Shitong Shao, Ziyi Wang, Zirui Shang, Jin Chen, Xiaofeng Ji, Xinxiao Wu
OOD · 18 Sep 2022

Towards Bridging the Performance Gaps of Joint Energy-based Models
Xiulong Yang, Qing Su, Shihao Ji
VLM · 16 Sep 2022

Model Generalization: A Sharpness Aware Optimization Perspective
Jozef Marus Coldenhoff, Chengkun Li, Yurui Zhu
14 Aug 2022

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Xingyu Xie, Pan Zhou, Huan Li, Zhouchen Lin, Shuicheng Yan
ODL · 13 Aug 2022

Deep is a Luxury We Don't Have
Ahmed Taha, Yen Nhi Truong Vu, Brent Mombourquette, Thomas P. Matthews, Jason Su, Sadanand Singh
ViT, MedIm · 11 Aug 2022

Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park, Yeongsang Jang, Eunhyeok Park
MQ · 31 Jul 2022

CrAM: A Compression-Aware Minimizer
Alexandra Peste, Adrian Vladu, Eldar Kurtic, Christoph H. Lampert, Dan Alistarh
28 Jul 2022

PoF: Post-Training of Feature Extractor for Improving Generalization
Ikuro Sato, Ryota Yamada, Masayuki Tanaka, Nakamasa Inoue, Rei Kawakami
05 Jul 2022

Augment like there's no tomorrow: Consistently performing neural networks for medical imaging
J. Pohjonen, Carolin Sturenberg, Atte Fohr, Reija Randén-Brady, L. Luomala, J. Lohi, Esa Pitkanen, A. Rannikko, T. Mirtti
OOD · 30 Jun 2022

Understanding and Extending Subgraph GNNs by Rethinking Their Symmetries
Fabrizio Frasca, Beatrice Bevilacqua, Michael M. Bronstein, Haggai Maron
22 Jun 2022

Towards Understanding Sharpness-Aware Minimization
Maksym Andriushchenko, Nicolas Flammarion
AAML · 13 Jun 2022

Fisher SAM: Information Geometry and Sharpness Aware Minimisation
Minyoung Kim, Da Li, S. Hu, Timothy M. Hospedales
AAML · 10 Jun 2022

Tackling covariate shift with node-based Bayesian neural networks
Trung Trinh, Markus Heinonen, Luigi Acerbi, Samuel Kaski
BDL, UQCV · 06 Jun 2022

Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Clara Na, Sanket Vaibhav Mehta, Emma Strubell
25 May 2022

Vision Transformers in 2022: An Update on Tiny ImageNet
Ethan Huynh
ViT · 21 May 2022

EXACT: How to Train Your Accuracy
I. Karpukhin, Stanislav Dereka, Sergey Kolesnikov
19 May 2022

Multimodal Transformer for Nursing Activity Recognition
Momal Ijaz, Renato Diaz, Chong Chen
ViT · 09 Apr 2022

TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization
Sijie Zhu, M. Shah, Chong Chen
ViT · 31 Mar 2022

Improving Generalization in Federated Learning by Seeking Flat Minima
Debora Caldarola, Barbara Caputo, Marco Ciccone
FedML · 22 Mar 2022

Randomized Sharpness-Aware Training for Boosting Computational Efficiency in Deep Learning
Yang Zhao, Hao Zhang, Xiuyuan Hu
18 Mar 2022

Surrogate Gap Minimization Improves Sharpness-Aware Training
Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Huayu Chen, Hartwig Adam, Nicha Dvornek, S. Tatikonda, James Duncan, Ting Liu
15 Mar 2022

Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
Yang Zhao, Hao Zhang, Xiuyuan Hu
08 Feb 2022

When Do Flat Minima Optimizers Work?
Jean Kaddour, Linqing Liu, Ricardo M. A. Silva, Matt J. Kusner
ODL · 01 Feb 2022

Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla, Jing Wang, A. Choromańska
20 Jan 2022

Generalized Wasserstein Dice Loss, Test-time Augmentation, and Transformers for the BraTS 2021 challenge
Lucas Fidon, Suprosanna Shit, Ivan Ezhov, Johannes C. Paetzold, Sébastien Ourselin, Tom Kamiel Magda Vercauteren
ViT, MedIm · 24 Dec 2021

Unsupervised Dense Information Retrieval with Contrastive Learning
Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, Edouard Grave
RALM · 16 Dec 2021

Sharpness-Aware Minimization with Dynamic Reweighting
Wenxuan Zhou, Fangyu Liu, Huan Zhang, Muhao Chen
AAML · 16 Dec 2021

Sharpness-aware Quantization for Deep Neural Networks
Jing Liu, Jianfei Cai, Bohan Zhuang
MQ · 24 Nov 2021

DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion
Arthur Douillard, Alexandre Ramé, Guillaume Couairon, Matthieu Cord
CLL · 22 Nov 2021

Exponential escape efficiency of SGD from sharp minima in non-stationary regime
Hikaru Ibayashi, Masaaki Imaizumi
07 Nov 2021

Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri, H. Mobahi, Yi Tay
16 Oct 2021

Efficient Sharpness-aware Minimization for Improved Training of Neural Networks
Jiawei Du, Hanshu Yan, Jiashi Feng, Qiufeng Wang, Liangli Zhen, Rick Siow Mong Goh, Vincent Y. F. Tan
AAML · 07 Oct 2021

Perturbated Gradients Updating within Unit Space for Deep Learning
Ching-Hsun Tseng, Liu Cheng, Shin-Jye Lee, Xiaojun Zeng
01 Oct 2021

Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability
Roman Levin, Manli Shu, Eitan Borgnia, Furong Huang, Micah Goldblum, Tom Goldstein
FAtt, AAML · 03 Aug 2021

A novel multi-scale loss function for classification problems in machine learning
L. Berlyand, Robert Creese, P. Jabin
04 Jun 2021

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
Robin M. Schmidt, Frank Schneider, Philipp Hennig
ODL · 03 Jul 2020

Aggregated Residual Transformations for Deep Neural Networks
Saining Xie, Ross B. Girshick, Piotr Dollár, Z. Tu, Kaiming He
16 Nov 2016

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
ODL · 15 Sep 2016