ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks
arXiv:2102.11600 · 23 February 2021
Jungmin Kwon, Jeongseop Kim, Hyunseong Park, I. Choi

Papers citing "ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks" (43 of 193 papers shown)

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models
Qihuang Zhong, Liang Ding, Li Shen, Peng Mi, Juhua Liu, Bo Du, Dacheng Tao · AAML · 50 citations · 11 Oct 2022

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, Dacheng Tao · AAML · 69 citations · 11 Oct 2022

SAM as an Optimal Relaxation of Bayes
Thomas Möllenhoff, Mohammad Emtiyaz Khan · BDL · 32 citations · 04 Oct 2022

Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
Sungyub Kim, Si-hun Park, Kyungsu Kim, Eunho Yang · BDL · 4 citations · 30 Sep 2022

Relaxed Attention for Transformer Models
Timo Lohrenz, Björn Möller, Zhengyang Li, Tim Fingscheidt · KELM · 11 citations · 20 Sep 2022

Bootstrap Generalization Ability from Loss Landscape Perspective
Huanran Chen, Shitong Shao, Ziyi Wang, Zirui Shang, Jin Chen, Xiaofeng Ji, Xinxiao Wu · OOD · 17 citations · 18 Sep 2022

Towards Bridging the Performance Gaps of Joint Energy-based Models
Xiulong Yang, Qing Su, Shihao Ji · VLM · 12 citations · 16 Sep 2022

Model Generalization: A Sharpness Aware Optimization Perspective
Jozef Marus Coldenhoff, Chengkun Li, Yurui Zhu · 2 citations · 14 Aug 2022

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Xingyu Xie, Pan Zhou, Huan Li, Zhouchen Lin, Shuicheng Yan · ODL · 150 citations · 13 Aug 2022

Deep is a Luxury We Don't Have
Ahmed Taha, Yen Nhi Truong Vu, Brent Mombourquette, Thomas P. Matthews, Jason Su, Sadanand Singh · ViT, MedIm · 2 citations · 11 Aug 2022

Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park, Yeongsang Jang, Eunhyeok Park · MQ · 1 citation · 31 Jul 2022

CrAM: A Compression-Aware Minimizer
Alexandra Peste, Adrian Vladu, Eldar Kurtic, Christoph H. Lampert, Dan Alistarh · 8 citations · 28 Jul 2022

PoF: Post-Training of Feature Extractor for Improving Generalization
Ikuro Sato, Ryota Yamada, Masayuki Tanaka, Nakamasa Inoue, Rei Kawakami · 2 citations · 05 Jul 2022

Augment like there's no tomorrow: Consistently performing neural networks for medical imaging
J. Pohjonen, Carolin Sturenberg, Atte Fohr, Reija Randén-Brady, L. Luomala, J. Lohi, Esa Pitkanen, A. Rannikko, T. Mirtti · OOD · 3 citations · 30 Jun 2022

Understanding and Extending Subgraph GNNs by Rethinking Their Symmetries
Fabrizio Frasca, Beatrice Bevilacqua, Michael M. Bronstein, Haggai Maron · 125 citations · 22 Jun 2022

Towards Understanding Sharpness-Aware Minimization
Maksym Andriushchenko, Nicolas Flammarion · AAML · 133 citations · 13 Jun 2022

Fisher SAM: Information Geometry and Sharpness Aware Minimisation
Minyoung Kim, Da Li, S. Hu, Timothy M. Hospedales · AAML · 68 citations · 10 Jun 2022

Tackling covariate shift with node-based Bayesian neural networks
Trung Trinh, Markus Heinonen, Luigi Acerbi, Samuel Kaski · BDL, UQCV · 6 citations · 06 Jun 2022

Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Clara Na, Sanket Vaibhav Mehta, Emma Strubell · 19 citations · 25 May 2022

Vision Transformers in 2022: An Update on Tiny ImageNet
Ethan Huynh · ViT · 11 citations · 21 May 2022

EXACT: How to Train Your Accuracy
I. Karpukhin, Stanislav Dereka, Sergey Kolesnikov · 0 citations · 19 May 2022

Multimodal Transformer for Nursing Activity Recognition
Momal Ijaz, Renato Diaz, Chong Chen · ViT · 26 citations · 09 Apr 2022

TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization
Sijie Zhu, M. Shah, Chong Chen · ViT · 147 citations · 31 Mar 2022

Improving Generalization in Federated Learning by Seeking Flat Minima
Debora Caldarola, Barbara Caputo, Marco Ciccone · FedML · 110 citations · 22 Mar 2022

Randomized Sharpness-Aware Training for Boosting Computational Efficiency in Deep Learning
Yang Zhao, Hao Zhang, Xiuyuan Hu · 9 citations · 18 Mar 2022

Surrogate Gap Minimization Improves Sharpness-Aware Training
Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Huayu Chen, Hartwig Adam, Nicha Dvornek, S. Tatikonda, James Duncan, Ting Liu · 146 citations · 15 Mar 2022

Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
Yang Zhao, Hao Zhang, Xiuyuan Hu · 116 citations · 08 Feb 2022

When Do Flat Minima Optimizers Work?
Jean Kaddour, Linqing Liu, Ricardo M. A. Silva, Matt J. Kusner · ODL · 58 citations · 01 Feb 2022

Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla, Jing Wang, A. Choromańska · 34 citations · 20 Jan 2022

Generalized Wasserstein Dice Loss, Test-time Augmentation, and Transformers for the BraTS 2021 challenge
Lucas Fidon, Suprosanna Shit, Ivan Ezhov, Johannes C. Paetzold, Sébastien Ourselin, Tom Kamiel Magda Vercauteren · ViT, MedIm · 8 citations · 24 Dec 2021

Unsupervised Dense Information Retrieval with Contrastive Learning
Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, Edouard Grave · RALM · 808 citations · 16 Dec 2021

Sharpness-Aware Minimization with Dynamic Reweighting
Wenxuan Zhou, Fangyu Liu, Huan Zhang, Muhao Chen · AAML · 8 citations · 16 Dec 2021

Sharpness-aware Quantization for Deep Neural Networks
Jing Liu, Jianfei Cai, Bohan Zhuang · MQ · 24 citations · 24 Nov 2021

DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion
Arthur Douillard, Alexandre Ramé, Guillaume Couairon, Matthieu Cord · CLL · 295 citations · 22 Nov 2021

Exponential escape efficiency of SGD from sharp minima in non-stationary regime
Hikaru Ibayashi, Masaaki Imaizumi · 4 citations · 07 Nov 2021

Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri, H. Mobahi, Yi Tay · 98 citations · 16 Oct 2021

Efficient Sharpness-aware Minimization for Improved Training of Neural Networks
Jiawei Du, Hanshu Yan, Jiashi Feng, Qiufeng Wang, Liangli Zhen, Rick Siow Mong Goh, Vincent Y. F. Tan · AAML · 132 citations · 07 Oct 2021

Perturbated Gradients Updating within Unit Space for Deep Learning
Ching-Hsun Tseng, Liu Cheng, Shin-Jye Lee, Xiaojun Zeng · 5 citations · 01 Oct 2021

Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability
Roman Levin, Manli Shu, Eitan Borgnia, Furong Huang, Micah Goldblum, Tom Goldstein · FAtt, AAML · 10 citations · 03 Aug 2021

A novel multi-scale loss function for classification problems in machine learning
L. Berlyand, Robert Creese, P. Jabin · 3 citations · 04 Jun 2021

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
Robin M. Schmidt, Frank Schneider, Philipp Hennig · ODL · 162 citations · 03 Jul 2020

Aggregated Residual Transformations for Deep Neural Networks
Saining Xie, Ross B. Girshick, Piotr Dollár, Z. Tu, Kaiming He · 10,225 citations · 16 Nov 2016

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang · ODL · 2,890 citations · 15 Sep 2016