1803.05407
Cited By
Averaging Weights Leads to Wider Optima and Better Generalization
14 March 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML
MoMe
Papers citing
"Averaging Weights Leads to Wider Optima and Better Generalization"
50 / 1,040 papers shown
Title
Measuring and Mitigating Local Instability in Deep Neural Networks
Arghya Datta
Subhrangshu Nandi
Jingcheng Xu
Greg Ver Steeg
He Xie
Anoop Kumar
Aram Galstyan
67
3
0
18 May 2023
Sharpness & Shift-Aware Self-Supervised Learning
Ngoc N. Tran
S. Duong
Hoang Phan
Tung Pham
Dinh Q. Phung
Trung Le
SSL
71
1
0
17 May 2023
Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models
Shangbin Feng
Weijia Shi
Yuyang Bai
Vidhisha Balachandran
Tianxing He
Yulia Tsvetkov
KELM
133
37
0
17 May 2023
Understanding and Improving Model Averaging in Federated Learning on Heterogeneous Data
Tailin Zhou
Zehong Lin
Jinchao Zhang
Danny H. K. Tsang
MoMe
FedML
100
12
0
13 May 2023
Continual Learning for End-to-End ASR by Averaging Domain Experts
Peter William VanHarn Plantinga
Jaekwon Yoo
C. Dhir
CLL
MoMe
45
1
0
12 May 2023
Sharpness-Aware Minimization Alone can Improve Adversarial Robustness
Zeming Wei
Jingyu Zhu
Yihao Zhang
AAML
89
11
0
09 May 2023
SRIL: Selective Regularization for Class-Incremental Learning
Jisu Han
Jaemin Na
Wonjun Hwang
CLL
141
0
0
09 May 2023
GradTree: Learning Axis-Aligned Decision Trees with Gradient Descent
Sascha Marton
Stefan Lüdtke
Christian Bartelt
Heiner Stuckenschmidt
137
12
0
05 May 2023
ZipIt! Merging Models from Different Tasks without Training
George Stoica
Daniel Bolya
J. Bjorner
Pratik Ramesh
Taylor N. Hearn
Judy Hoffman
VLM
MoMe
139
125
0
04 May 2023
Stimulative Training++: Go Beyond The Performance Limits of Residual Networks
XinYu Piao
Tong He
DoangJoo Synn
Baopu Li
Tao Chen
Lei Bai
Jong-Kook Kim
89
4
0
04 May 2023
Semi-Supervised Segmentation of Functional Tissue Units at the Cellular Level
V. Sydorskyi
Igor Krashenyi
Denis Savka
Oleksandr Zarichkovyi
41
1
0
03 May 2023
An Adaptive Policy to Employ Sharpness-Aware Minimization
Weisen Jiang
Hansi Yang
Yu Zhang
James T. Kwok
AAML
130
34
0
28 Apr 2023
Advancing Ischemic Stroke Diagnosis: A Novel Two-Stage Approach for Blood Clot Origin Identification
Koushik Sivarama Krishnan
P. J. J. Nikesh
Swathi Gnanasekar
Karthik Sivarama Krishnan
48
0
0
26 Apr 2023
Sound-based drone fault classification using multitask learning
Wonjun Yi
Jung-Woo Choi
Jae-Woo Lee
122
5
0
23 Apr 2023
Hierarchical Weight Averaging for Deep Neural Networks
Xiaozhe Gu
Zixun Zhang
Yuncheng Jiang
Yaoyu Zhang
Ruimao Zhang
Shuguang Cui
Zhuguo Li
59
5
0
23 Apr 2023
Decoupled Training for Long-Tailed Classification With Stochastic Representations
G. Nam
Sunguk Jang
Juho Lee
OOD
BDL
OODD
66
14
0
19 Apr 2023
OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images
Bingchen Zhao
Jiahao Wang
Wufei Ma
Artur Jesslen
Si-Jia Yang
Shaozuo Yu
O. Zendel
Christian Theobalt
Alan Yuille
Adam Kortylewski
95
9
0
17 Apr 2023
Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization
Agustinus Kristiadi
Alexander Immer
Runa Eschenhagen
Vincent Fortuin
BDL
UQCV
80
10
0
17 Apr 2023
Cross-Entropy Loss Functions: Theoretical Analysis and Applications
Anqi Mao
M. Mohri
Yutao Zhong
AAML
123
328
0
14 Apr 2023
Deep neural networks have an inbuilt Occam's razor
Chris Mingard
Henry Rees
Guillermo Valle Pérez
A. Louis
UQCV
BDL
62
16
0
13 Apr 2023
Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging
Aarne Talman
H. Çelikkanat
Sami Virpioja
Markus Heinonen
Jörg Tiedemann
BDL
UQCV
84
8
0
10 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
105
43
0
07 Apr 2023
PopulAtion Parameter Averaging (PAPA)
Alexia Jolicoeur-Martineau
Emy Gervais
Kilian Fatras
Yan Zhang
Simon Lacoste-Julien
MoMe
110
21
0
06 Apr 2023
Going Further: Flatness at the Rescue of Early Stopping for Adversarial Example Transferability
Martin Gubri
Maxime Cordy
Yves Le Traon
AAML
92
3
1
05 Apr 2023
ERM++: An Improved Baseline for Domain Generalization
Piotr Teterwak
Kuniaki Saito
Theodoros Tsiligkaridis
Kate Saenko
Bryan A. Plummer
OOD
93
10
0
04 Apr 2023
Randomized Adversarial Style Perturbations for Domain Generalization
Taehoon Kim
Bohyung Han
AAML
87
2
0
04 Apr 2023
Improving Fast Adversarial Training with Prior-Guided Knowledge
Xiaojun Jia
Yong Zhang
Xingxing Wei
Baoyuan Wu
Ke Ma
Jue Wang
Xiaochun Cao
AAML
98
32
0
01 Apr 2023
Whether and When does Endoscopy Domain Pretraining Make Sense?
Dominik Batić
Felix Holm
Ege Özsoy
Tobias Czempiel
Nassir Navab
35
7
0
30 Mar 2023
Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
Tianli Zhang
Mengqi Xue
Jiangtao Zhang
Haofei Zhang
Yu Wang
Lechao Cheng
Mingli Song
61
6
0
26 Mar 2023
CFA: Class-wise Calibrated Fair Adversarial Training
Zeming Wei
Yifei Wang
Yiwen Guo
Yisen Wang
AAML
104
54
0
25 Mar 2023
Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics
Leonardo Iurada
S. Bucci
Timothy M. Hospedales
Tatiana Tommasi
72
0
0
25 Mar 2023
Generalist: Decoupling Natural and Robust Generalization
Hongjun Wang
Yisen Wang
OOD
AAML
97
14
0
24 Mar 2023
A Survey of Historical Learning: Learning Models with Learning History
Xiang Li
Ge Wu
Lingfeng Yang
Wenzhe Wang
Renjie Song
Jian Yang
MU
AI4TS
103
2
0
23 Mar 2023
Revisiting the Fragility of Influence Functions
Jacob R. Epifano
Ravichandran Ramachandran
A. Masino
Ghulam Rasool
TDI
75
14
0
22 Mar 2023
Semantic segmentation of surgical hyperspectral images under geometric domain shifts
Jan Sellner
Silvia Seidlitz
Alexander Studier-Fischer
Alessandro Motta
Berkin Özdemir
Beat Peter Müller-Stich
Felix Nickel
Lena Maier-Hein
29
7
0
20 Mar 2023
Randomized Adversarial Training via Taylor Expansion
Gao Jin
Xinping Yi
Dengyu Wu
Ronghui Mu
Xiaowei Huang
AAML
111
37
0
19 Mar 2023
Sharpness-Aware Gradient Matching for Domain Generalization
Pengfei Wang
Zhaoxiang Zhang
Zhen Lei
Lei Zhang
73
95
0
18 Mar 2023
Rethinking Model Ensemble in Transfer-based Adversarial Attacks
Huanran Chen
Yichi Zhang
Yinpeng Dong
Xiao Yang
Hang Su
Junyi Zhu
AAML
111
70
0
16 Mar 2023
CAT: Causal Audio Transformer for Audio Classification
Xiaoyu Liu
Hanlin Lu
Jianbo Yuan
Xinyu Li
ViT
83
24
0
14 Mar 2023
Domain Generalization in Machine Learning Models for Wireless Communications: Concepts, State-of-the-Art, and Open Issues
Mohamed Akrout
Amal Feriani
F. Bellili
A. Mezghani
Ekram Hossain
OOD
AI4CE
101
28
0
13 Mar 2023
Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models
Zangwei Zheng
Mingyu Ma
Kai Wang
Ziheng Qin
Xiangyu Yue
Yang You
CLL
VLM
164
79
0
12 Mar 2023
Loss-Curvature Matching for Dataset Selection and Condensation
Seung-Jae Shin
Heesun Bae
DongHyeok Shin
Weonyoung Joo
Il-Chul Moon
DD
96
27
0
08 Mar 2023
TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation
David Berthelot
Arnaud Autef
Jierui Lin
Dian Ang Yap
Shuangfei Zhai
Siyuan Hu
Daniel Zheng
Walter Talbot
Eric Gu
DiffM
101
94
0
07 Mar 2023
To Stay or Not to Stay in the Pre-train Basin: Insights on Ensembling in Transfer Learning
Ildus Sadrtdinov
Dmitrii Pozdeev
Dmitry Vetrov
E. Lobacheva
93
6
0
06 Mar 2023
Rethinking Confidence Calibration for Failure Prediction
Fei Zhu
Zhen Cheng
Xu-Yao Zhang
Cheng-Lin Liu
UQCV
91
41
0
06 Mar 2023
Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
Xingxuan Zhang
Renzhe Xu
Han Yu
Hao Zou
Peng Cui
77
41
0
03 Mar 2023
DSD²: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?
Victor Quétu
Enzo Tartaglione
89
7
0
02 Mar 2023
Average of Pruning: Improving Performance and Stability of Out-of-Distribution Detection
Zhen Cheng
Fei Zhu
Xu-Yao Zhang
Cheng-Lin Liu
MoMe
OODD
93
12
0
02 Mar 2023
Domain-aware Triplet loss in Domain Generalization
Kai Guo
Brian C. Lovell
OOD
84
7
0
01 Mar 2023
DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks
Samyak Jain
Sravanti Addepalli
P. Sahu
Priyam Dey
R. Venkatesh Babu
MoMe
OOD
118
20
0
28 Feb 2023