Sharpness-Aware Minimization for Efficiently Improving Generalization

3 October 2020
Pierre Foret
Ariel Kleiner
H. Mobahi
Behnam Neyshabur
AAML

Papers citing "Sharpness-Aware Minimization for Efficiently Improving Generalization"

50 / 867 papers shown
Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization
Mingli Zhu
Shaokui Wei
Li Shen
Yanbo Fan
Baoyuan Wu
AAML
41
51
0
24 Apr 2023
Hierarchical Weight Averaging for Deep Neural Networks
Xiaozhe Gu
Zixun Zhang
Yuncheng Jiang
Tao Luo
Ruimao Zhang
Shuguang Cui
Zhuguo Li
27
5
0
23 Apr 2023
Decoupled Training for Long-Tailed Classification With Stochastic Representations
G. Nam
Sunguk Jang
Juho Lee
OOD
BDL
OODD
30
13
0
19 Apr 2023
OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images
Bingchen Zhao
Jiahao Wang
Wufei Ma
Artur Jesslen
Si-Jia Yang
Shaozuo Yu
O. Zendel
Christian Theobalt
Alan Yuille
Adam Kortylewski
37
8
0
17 Apr 2023
Fairness in AI and Its Long-Term Implications on Society
Ondrej Bohdal
Timothy M. Hospedales
Philip Torr
Fazl Barez
15
4
0
16 Apr 2023
Assessment Framework for Deepfake Detection in Real-world Situations
Yuhang Lu
Touradj Ebrahimi
CVBM
32
17
0
12 Apr 2023
The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
Micah Goldblum
Marc Finzi
K. Rowan
A. Wilson
UQCV
FedML
26
38
0
11 Apr 2023
Simulated Annealing in Early Layers Leads to Better Generalization
Amir M. Sarfi
Zahra Karimpour
Muawiz Chaudhary
N. Khalid
Mirco Ravanelli
Sudhir Mudur
Eugene Belilovsky
AI4CE
CLL
23
7
0
10 Apr 2023
Exploring the Connection between Robust and Generative Models
Senad Beadini
I. Masi
AAML
32
1
0
08 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
30
41
0
07 Apr 2023
On the Pareto Front of Multilingual Neural Machine Translation
Liang Chen
Shuming Ma
Dongdong Zhang
Furu Wei
Baobao Chang
MoE
23
5
0
06 Apr 2023
Domain Generalization with Adversarial Intensity Attack for Medical Image Segmentation
Zheyu Zhang
Bin Wang
Lanhong Yao
Ugur Demir
Debesh Jha
I. Turkbey
Boqing Gong
Ulas Bagci
AAML
MedIm
OOD
32
11
0
05 Apr 2023
Going Further: Flatness at the Rescue of Early Stopping for Adversarial Example Transferability
Martin Gubri
Maxime Cordy
Yves Le Traon
AAML
20
3
1
05 Apr 2023
Few-shot Fine-tuning is All You Need for Source-free Domain Adaptation
Suho Lee
SeungWon Seo
Jihyo Kim
Yejin Lee
Sangheum Hwang
24
5
0
03 Apr 2023
Per-Example Gradient Regularization Improves Learning Signals from Noisy Data
Xuran Meng
Yuan Cao
Difan Zou
25
5
0
31 Mar 2023
Impact of Video Processing Operations in Deepfake Detection
Yuhang Lu
Touradj Ebrahimi
CVBM
27
3
0
30 Mar 2023
Self-accumulative Vision Transformer for Bone Age Assessment Using the Sauvegrain Method
Hong-Jun Choi
Dongbin Na
Kyungjin Cho
Byunguk Bae
Seo Taek Kong
Hyun-Suk An
16
0
0
29 Mar 2023
Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
Tianli Zhang
Mengqi Xue
Jiangtao Zhang
Haofei Zhang
Yu Wang
Lechao Cheng
Jie Song
Mingli Song
28
5
0
26 Mar 2023
Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization
Zhuo Huang
Miaoxi Zhu
Xiaobo Xia
Li Shen
Jun Yu
Chen Gong
Bo Han
Bo Du
Tongliang Liu
38
33
0
23 Mar 2023
Sample4Geo: Hard Negative Sampling For Cross-View Geo-Localisation
Fabian Deuser
Konrad Habel
Norbert Oswald
27
54
0
21 Mar 2023
Make Landscape Flatter in Differentially Private Federated Learning
Yi Shi
Yingqi Liu
Kang Wei
Li Shen
Xueqian Wang
Dacheng Tao
FedML
25
54
0
20 Mar 2023
A hybrid CNN-RNN approach for survival analysis in a Lung Cancer Screening study
Yaozhi Lu
S. Aslani
A. Zhao
Ahmed H. Shahin
D. Barber
M. Emberton
Daniel C. Alexander
Joseph Jacob
45
6
0
19 Mar 2023
Randomized Adversarial Training via Taylor Expansion
Gao Jin
Xinping Yi
Dengyu Wu
Ronghui Mu
Xiaowei Huang
AAML
44
34
0
19 Mar 2023
Sharpness-Aware Gradient Matching for Domain Generalization
Pengfei Wang
Zhaoxiang Zhang
Zhen Lei
Lei Zhang
19
83
0
18 Mar 2023
Diffusion-based Target Sampler for Unsupervised Domain Adaptation
Yulong Zhang
Shuhao Chen
Yu Zhang
Jiangang Lu
DiffM
42
0
0
17 Mar 2023
Rethinking Model Ensemble in Transfer-based Adversarial Attacks
Huanran Chen
Yichi Zhang
Yinpeng Dong
Xiao Yang
Hang Su
Junyi Zhu
AAML
28
56
0
16 Mar 2023
Allegro-Legato: Scalable, Fast, and Robust Neural-Network Quantum Molecular Dynamics via Sharpness-Aware Minimization
Hikaru Ibayashi
Taufeq Mohammed Razakh
Liqiu Yang
T. Linker
M. Olguin
...
Ye Luo
R. Kalia
A. Nakano
K. Nomura
P. Vashishta
43
9
0
14 Mar 2023
Domain Generalization via Nuclear Norm Regularization
Zhenmei Shi
Yifei Ming
Ying Fan
Frederic Sala
Yingyu Liang
30
12
0
13 Mar 2023
Stabilizing Transformer Training by Preventing Attention Entropy Collapse
Shuangfei Zhai
Tatiana Likhomanenko
Etai Littwin
Dan Busbridge
Jason Ramapuram
Yizhe Zhang
Jiatao Gu
J. Susskind
AAML
46
65
0
11 Mar 2023
Loss-Curvature Matching for Dataset Selection and Condensation
Seung-Jae Shin
Heesun Bae
DongHyeok Shin
Weonyoung Joo
Il-Chul Moon
DD
49
24
0
08 Mar 2023
Chasing Fairness Under Distribution Shift: A Model Weight Perturbation Approach
Zhimeng Jiang
Xiaotian Han
Hongye Jin
Guanchu Wang
Rui Chen
Na Zou
Xia Hu
12
13
0
06 Mar 2023
Rethinking Confidence Calibration for Failure Prediction
Fei Zhu
Zhen Cheng
Xu-Yao Zhang
Cheng-Lin Liu
UQCV
22
39
0
06 Mar 2023
DiTTO: A Feature Representation Imitation Approach for Improving Cross-Lingual Transfer
Shanu Kumar
Abbaraju Soujanya
Sandipan Dandapat
Sunayana Sitaram
Monojit Choudhury
VLM
33
1
0
04 Mar 2023
What Is Missing in IRM Training and Evaluation? Challenges and Solutions
Yihua Zhang
Pranay Sharma
Parikshit Ram
Min-Fong Hong
Kush R. Varshney
Sijia Liu
34
11
0
04 Mar 2023
Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
Xingxuan Zhang
Renzhe Xu
Han Yu
Hao Zou
Peng Cui
21
39
0
03 Mar 2023
Deep Neural Networks with Efficient Guaranteed Invariances
M. Rath
A. P. Condurache
18
4
0
02 Mar 2023
AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks
Hao Sun
Li Shen
Qihuang Zhong
Liang Ding
Shi-Yong Chen
Jingwei Sun
Jing Li
Guangzhong Sun
Dacheng Tao
49
31
0
01 Mar 2023
DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks
Samyak Jain
Sravanti Addepalli
P. Sahu
Priyam Dey
R. Venkatesh Babu
MoMe
OOD
43
20
0
28 Feb 2023
FedCLIP: Fast Generalization and Personalization for CLIP in Federated Learning
Wang Lu
Xixu Hu
Jindong Wang
Xingxu Xie
FedML
VLM
30
52
0
27 Feb 2023
On the Training Instability of Shuffling SGD with Batch Normalization
David Wu
Chulhee Yun
S. Sra
32
4
0
24 Feb 2023
Towards Stable Test-Time Adaptation in Dynamic Wild World
Shuaicheng Niu
Jiaxiang Wu
Yifan Zhang
Z. Wen
Yaofo Chen
P. Zhao
Mingkui Tan
TTA
35
248
0
24 Feb 2023
Less is More: Data Pruning for Faster Adversarial Training
Yize Li
Pu Zhao
X. Lin
B. Kailkhura
Ryan Goldh
AAML
15
9
0
23 Feb 2023
Phase diagram of early training dynamics in deep neural networks: effect of the learning rate, depth, and width
Dayal Singh Kalra
M. Barkeshli
15
9
0
23 Feb 2023
What Can We Learn From The Selective Prediction And Uncertainty Estimation Performance Of 523 Imagenet Classifiers
Ido Galil
Mohammed Dabbah
Ran El-Yaniv
UQCV
38
24
0
23 Feb 2023
On Statistical Properties of Sharpness-Aware Minimization: Provable Guarantees
Kayhan Behdin
Rahul Mazumder
38
6
0
23 Feb 2023
Learning to Generalize Provably in Learning to Optimize
Junjie Yang
Tianlong Chen
Mingkang Zhu
Fengxiang He
Dacheng Tao
Yitao Liang
Zhangyang Wang
34
6
0
22 Feb 2023
FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy
Yan Sun
Li Shen
Tiansheng Huang
Liang Ding
Dacheng Tao
FedML
36
51
0
21 Feb 2023
mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization
Kayhan Behdin
Qingquan Song
Aman Gupta
S. Keerthi
Ayan Acharya
Borja Ocejo
Gregory Dexter
Rajiv Khanna
D. Durfee
Rahul Mazumder
AAML
18
7
0
19 Feb 2023
Why is parameter averaging beneficial in SGD? An objective smoothing perspective
Atsushi Nitanda
Ryuhei Kikuchi
Shugo Maeda
Denny Wu
FedML
25
0
0
18 Feb 2023
SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
Atish Agarwala
Yann N. Dauphin
21
20
0
17 Feb 2023