2010.01412
Cited By
Sharpness-Aware Minimization for Efficiently Improving Generalization
3 October 2020
Pierre Foret
Ariel Kleiner
H. Mobahi
Behnam Neyshabur
AAML
Papers citing
"Sharpness-Aware Minimization for Efficiently Improving Generalization"
50 / 867 papers shown
Title
Investigating the Histogram Loss in Regression
Ehsan Imani
Kai Luedemann
Sam Scholnick-Hughes
Esraa Elelimy
Martha White
UQCV
34
5
0
20 Feb 2024
Predicting Maximum Permitted Process Forces for Object Grasping and Manipulation Using a Deep Learning Regression Model
S. Wucherer
R. McMurray
K. Y. Ng
F. Kerber
28
1
0
18 Feb 2024
Mirror Gradient: Towards Robust Multimodal Recommender Systems via Exploring Flat Local Minima
Shan Zhong
Zhongzhan Huang
Daifeng Li
Wushao Wen
Jinghui Qin
Liang Lin
22
12
0
17 Feb 2024
SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
Romain Ilbert
Ambroise Odonnat
Vasilii Feofanov
Aladin Virmaux
Giuseppe Paolo
Themis Palpanas
I. Redko
AI4TS
49
22
0
15 Feb 2024
Why are Sensitive Functions Hard for Transformers?
Michael Hahn
Mark Rofin
41
25
0
15 Feb 2024
Criterion Collapse and Loss Distribution Control
Matthew J. Holland
28
2
0
15 Feb 2024
Medical Image Segmentation with InTEnt: Integrated Entropy Weighting for Single Image Test-Time Adaptation
Haoyu Dong
N. Konz
Han Gu
Maciej Mazurowski
OOD
26
4
0
14 Feb 2024
Switch EMA: A Free Lunch for Better Flatness and Sharpness
Siyuan Li
Zicheng Liu
Juanxi Tian
Ge Wang
Zedong Wang
...
Cheng Tan
Tao Lin
Yang Liu
Baigui Sun
Stan Z. Li
30
6
0
14 Feb 2024
A PAC-Bayesian Link Between Generalisation and Flat Minima
Maxime Haddouche
Paul Viallard
Umut Simsekli
Benjamin Guedj
43
3
0
13 Feb 2024
Cacophony: An Improved Contrastive Audio-Text Model
Ge Zhu
Jordan Darefsky
Zhiyao Duan
AuLLM
46
11
0
10 Feb 2024
Optimizing for ROC Curves on Class-Imbalanced Data by Training over a Family of Loss Functions
Kelsey Lieberman
Shuai Yuan
Swarna Kamlam Ravindran
Carlo Tomasi
26
0
0
08 Feb 2024
Tradeoffs of Diagonal Fisher Information Matrix Estimators
Alexander Soen
Ke Sun
21
1
0
08 Feb 2024
Intersectional Two-sided Fairness in Recommendation
Yifan Wang
Peijie Sun
Weizhi Ma
Min Zhang
Yuan Zhang
Peng Jiang
Shaoping Ma
FaML
31
8
0
05 Feb 2024
Optimal Parameter and Neuron Pruning for Out-of-Distribution Detection
Chao Chen
Zhihang Fu
Kai-Chun Liu
Ze Chen
Mingyuan Tao
Jieping Ye
OODD
33
3
0
04 Feb 2024
Training-time Neuron Alignment through Permutation Subspace for Improving Linear Mode Connectivity and Model Fusion
Zexi Li
Zhiqi Li
Jie Lin
Tao Shen
Tao Lin
Chao Wu
41
4
0
02 Feb 2024
Vaccine: Perturbation-aware Alignment for Large Language Model
Tiansheng Huang
Sihao Hu
Ling Liu
50
33
0
02 Feb 2024
Progressive Multi-task Anti-Noise Learning and Distilling Frameworks for Fine-grained Vehicle Recognition
Dichao Liu
21
0
0
25 Jan 2024
LAA-Net: Localized Artifact Attention Network for Quality-Agnostic and Generalizable Deepfake Detection
Dat Nguyen
Nesryne Mejri
I. Singh
Polina Kuleshova
Marcella Astrid
Anis Kacem
Enjie Ghorbel
Djamila Aouada
30
25
0
24 Jan 2024
Catch-Up Mix: Catch-Up Class for Struggling Filters in CNN
Minsoo Kang
Minkoo Kang
Suhyun Kim
19
3
0
24 Jan 2024
A Precise Characterization of SGD Stability Using Loss Surface Geometry
Gregory Dexter
Borja Ocejo
S. Keerthi
Aman Gupta
Ayan Acharya
Rajiv Khanna
MLT
30
0
0
22 Jan 2024
Momentum-SAM: Sharpness Aware Minimization without Computational Overhead
Marlon Becker
Frederick Altrock
Benjamin Risse
79
5
0
22 Jan 2024
Neglected Hessian component explains mysteries in Sharpness regularization
Yann N. Dauphin
Atish Agarwala
Hossein Mobahi
FAtt
46
7
0
19 Jan 2024
MADA: Meta-Adaptive Optimizers through hyper-gradient Descent
Kaan Ozkara
Can Karakus
Parameswaran Raman
Mingyi Hong
Shoham Sabach
B. Kveton
V. Cevher
27
2
0
17 Jan 2024
Bag of Tricks to Boost Adversarial Transferability
Zeliang Zhang
Rongyi Zhu
Wei Yao
Xiaosen Wang
Chenliang Xu
AAML
47
9
0
16 Jan 2024
Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy
Chengli Tan
Jiangshe Zhang
Junmin Liu
Yicheng Wang
Yunda Hao
AAML
34
1
0
14 Jan 2024
EsaCL: Efficient Continual Learning of Sparse Models
Weijieying Ren
V. Honavar
CLL
25
3
0
11 Jan 2024
VLLaVO: Mitigating Visual Gap through LLMs
Shuhao Chen
Yulong Zhang
Weisen Jiang
Jiangang Lu
Yu Zhang
VLM
54
2
0
06 Jan 2024
Interpreting Adaptive Gradient Methods by Parameter Scaling for Learning-Rate-Free Optimization
Min-Kook Suh
Seung-Woo Seo
ODL
29
0
0
06 Jan 2024
Plug-and-Play Transformer Modules for Test-Time Adaptation
Xiangyu Chang
Sk. Miraj Ahmed
S. Krishnamurthy
Başak Güler
A. Swami
Samet Oymak
A. Roy-Chowdhury
33
0
0
06 Jan 2024
Calibration Attacks: A Comprehensive Study of Adversarial Attacks on Model Confidence
Stephen Obadinma
Xiaodan Zhu
Hongyu Guo
AAML
14
1
0
05 Jan 2024
Universal Pyramid Adversarial Training for Improved ViT Performance
Ping-yeh Chiang
Yipin Zhou
Omid Poursaeed
S. Narayan Shukla
Tom Goldstein
Ser-Nam Lim
AAML
ViT
16
0
0
26 Dec 2023
CR-SAM: Curvature Regularized Sharpness-Aware Minimization
Tao Wu
Tie Luo
D. C. Wunsch
18
3
0
21 Dec 2023
LRS: Enhancing Adversarial Transferability through Lipschitz Regularized Surrogate
Tao Wu
Tie Luo
D. C. Wunsch
41
4
0
20 Dec 2023
Doubly Perturbed Task Free Continual Learning
Byung Hyun Lee
Min-hwan Oh
Se Young Chun
27
3
0
20 Dec 2023
Sparse is Enough in Fine-tuning Pre-trained Large Language Models
Weixi Song
Z. Li
Lefei Zhang
Hai Zhao
Bo Du
VLM
23
7
0
19 Dec 2023
Weighted Ensemble Models Are Strong Continual Learners
Imad Eddine Marouf
Subhankar Roy
Enzo Tartaglione
Stéphane Lathuilière
CLL
32
16
0
14 Dec 2023
Generalized Deepfakes Detection with Reconstructed-Blended Images and Multi-scale Feature Reconstruction Network
Yuyang Sun
H. Nguyen
Chun-Shien Lu
ZhiYong Zhang
Lu Sun
Isao Echizen
CVBM
45
2
0
13 Dec 2023
Investigation into the Training Dynamics of Learned Optimizers
Jan Sobotka
Petr Simánek
Daniel Vasata
28
0
0
12 Dec 2023
ELSA: Partial Weight Freezing for Overhead-Free Sparse Network Deployment
Paniz Halvachi
Alexandra Peste
Dan Alistarh
Christoph H. Lampert
25
0
0
11 Dec 2023
AUGCAL: Improving Sim2Real Adaptation by Uncertainty Calibration on Augmented Synthetic Images
Prithvijit Chattopadhyay
Bharat Goyal
B. Ecsedi
Viraj Prabhu
Judy Hoffman
49
0
0
11 Dec 2023
Cross Domain Generative Augmentation: Domain Generalization with Latent Diffusion Models
S. Hemati
Mahdi Beitollahi
A. Estiri
Bassel Al Omari
Xi Chen
Guojun Zhang
19
6
0
08 Dec 2023
RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training
Jaehyung Kim
Yuning Mao
Rui Hou
Hanchao Yu
Davis Liang
Pascale Fung
Qifan Wang
Fuli Feng
Lifu Huang
Madian Khabsa
AAML
28
2
0
07 Dec 2023
Adapting Newton's Method to Neural Networks through a Summary of Higher-Order Derivatives
Pierre Wolinski
ODL
29
0
0
06 Dec 2023
f-FERM: A Scalable Framework for Robust Fair Empirical Risk Minimization
Sina Baharlouei
Shivam Patel
Meisam Razaviyayn
35
4
0
06 Dec 2023
Calibrated Adaptive Teacher for Domain Adaptive Intelligent Fault Diagnosis
Florent Forest
Olga Fink
25
0
0
05 Dec 2023
Simplifying Neural Network Training Under Class Imbalance
Ravid Shwartz-Ziv
Micah Goldblum
Yucen Lily Li
C. Bayan Bruss
Andrew Gordon Wilson
31
14
0
05 Dec 2023
Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars
Kaiyue Wen
Yuchen Li
Bing Liu
Andrej Risteski
26
21
0
03 Dec 2023
On the Interplay Between Stepsize Tuning and Progressive Sharpening
Vincent Roulet
Atish Agarwala
Fabian Pedregosa
13
4
0
30 Nov 2023
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
44
1
0
29 Nov 2023
Should We Learn Most Likely Functions or Parameters?
Shikai Qiu
Tim G. J. Rudner
Sanyam Kapoor
Andrew Gordon Wilson
13
5
0
27 Nov 2023