Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1803.05407
Cited By
Averaging Weights Leads to Wider Optima and Better Generalization
14 March 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML
MoMe
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Averaging Weights Leads to Wider Optima and Better Generalization"
50 / 362 papers shown
Title
Measuring and Mitigating Local Instability in Deep Neural Networks
Arghya Datta
Subhrangshu Nandi
Jingcheng Xu
Greg Ver Steeg
He Xie
Anoop Kumar
Aram Galstyan
20
3
0
18 May 2023
SRIL: Selective Regularization for Class-Incremental Learning
Jisu Han
Jaemin Na
Wonjun Hwang
CLL
10
0
0
09 May 2023
An Adaptive Policy to Employ Sharpness-Aware Minimization
Weisen Jiang
Hansi Yang
Yu Zhang
James T. Kwok
AAML
83
31
0
28 Apr 2023
Advancing Ischemic Stroke Diagnosis: A Novel Two-Stage Approach for Blood Clot Origin Identification
Koushik Sivarama Krishnan
P. J. J. Nikesh
Swathi Gnanasekar
Karthik Sivarama Krishnan
24
0
0
26 Apr 2023
Do deep neural networks have an inbuilt Occam's razor?
Chris Mingard
Henry Rees
Guillermo Valle Pérez
A. Louis
UQCV
BDL
21
16
0
13 Apr 2023
Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging
Aarne Talman
H. Çelikkanat
Sami Virpioja
Markus Heinonen
Jörg Tiedemann
BDL
UQCV
26
7
0
10 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
30
41
0
07 Apr 2023
Going Further: Flatness at the Rescue of Early Stopping for Adversarial Example Transferability
Martin Gubri
Maxime Cordy
Yves Le Traon
AAML
20
3
1
05 Apr 2023
ERM++: An Improved Baseline for Domain Generalization
Piotr Teterwak
Kuniaki Saito
Theodoros Tsiligkaridis
Kate Saenko
Bryan A. Plummer
OOD
41
9
0
04 Apr 2023
Randomized Adversarial Style Perturbations for Domain Generalization
Taehoon Kim
Bohyung Han
AAML
35
2
0
04 Apr 2023
Improving Fast Adversarial Training with Prior-Guided Knowledge
Xiaojun Jia
Yong Zhang
Xingxing Wei
Baoyuan Wu
Ke Ma
Jue Wang
Xiaochun Cao
AAML
34
26
0
01 Apr 2023
Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
Tianli Zhang
Mengqi Xue
Jiangtao Zhang
Haofei Zhang
Yu Wang
Lechao Cheng
Mingli Song
Mingli Song
28
5
0
26 Mar 2023
Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics
Leonardo Iurada
S. Bucci
Timothy M. Hospedales
Tatiana Tommasi
27
0
0
25 Mar 2023
Generalist: Decoupling Natural and Robust Generalization
Hongjun Wang
Yisen Wang
OOD
AAML
49
14
0
24 Mar 2023
Randomized Adversarial Training via Taylor Expansion
Gao Jin
Xinping Yi
Dengyu Wu
Ronghui Mu
Xiaowei Huang
AAML
44
34
0
19 Mar 2023
Rethinking Model Ensemble in Transfer-based Adversarial Attacks
Huanran Chen
Yichi Zhang
Yinpeng Dong
Xiao Yang
Hang Su
Junyi Zhu
AAML
28
56
0
16 Mar 2023
CAT: Causal Audio Transformer for Audio Classification
Xiaoyu Liu
Hanlin Lu
Jianbo Yuan
Xinyu Li
ViT
28
22
0
14 Mar 2023
TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation
David Berthelot
Arnaud Autef
Jierui Lin
Dian Ang Yap
Shuangfei Zhai
Siyuan Hu
Daniel Zheng
Walter Talbot
Eric Gu
DiffM
31
80
0
07 Mar 2023
Rethinking Confidence Calibration for Failure Prediction
Fei Zhu
Zhen Cheng
Xu-Yao Zhang
Cheng-Lin Liu
UQCV
22
39
0
06 Mar 2023
DSD
2
^2
2
: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?
Victor Quétu
Enzo Tartaglione
32
7
0
02 Mar 2023
Average of Pruning: Improving Performance and Stability of Out-of-Distribution Detection
Zhen Cheng
Fei Zhu
Xu-Yao Zhang
Cheng-Lin Liu
MoMe
OODD
40
11
0
02 Mar 2023
DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks
Samyak Jain
Sravanti Addepalli
P. Sahu
Priyam Dey
R. Venkatesh Babu
MoMe
OOD
43
20
0
28 Feb 2023
A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking
Chang-Shu Liu
Yinpeng Dong
Wenzhao Xiang
X. Yang
Hang Su
Junyi Zhu
YueFeng Chen
Yuan He
H. Xue
Shibao Zheng
OOD
VLM
AAML
33
72
0
28 Feb 2023
Analyzing Populations of Neural Networks via Dynamical Model Embedding
Jordan S. Cotler
Kai Sheng Tai
Felipe Hernández
Blake Elias
David Sussillo
17
4
0
27 Feb 2023
Personalized Privacy-Preserving Framework for Cross-Silo Federated Learning
Van Tuan Tran
Huy Hieu Pham
Kok-Seng Wong
FedML
36
7
0
22 Feb 2023
Scalable Bayesian optimization with high-dimensional outputs using randomized prior networks
Mohamed Aziz Bhouri
M. Joly
Robert Yu
S. Sarkar
P. Perdikaris
BDL
UQCV
AI4CE
19
1
0
14 Feb 2023
A Modern Look at the Relationship between Sharpness and Generalization
Maksym Andriushchenko
Francesco Croce
Maximilian Müller
Matthias Hein
Nicolas Flammarion
3DH
19
55
0
14 Feb 2023
Contour-based Interactive Segmentation
Danil Galeev
Polina Popenova
Anna Vorontsova
Anton Konushin
38
5
0
13 Feb 2023
Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples
Qizhang Li
Yiwen Guo
W. Zuo
Hao Chen
AAML
31
35
0
10 Feb 2023
Better Diffusion Models Further Improve Adversarial Training
Zekai Wang
Tianyu Pang
Chao Du
Min-Bin Lin
Weiwei Liu
Shuicheng Yan
DiffM
24
208
0
09 Feb 2023
Generalization in Graph Neural Networks: Improved PAC-Bayesian Bounds on Graph Diffusion
Haotian Ju
Dongyue Li
Aneesh Sharma
Hongyang R. Zhang
31
40
0
09 Feb 2023
A Survey of Deep Learning: From Activations to Transformers
Johannes Schneider
Michalis Vlachos
ViT
MedIm
AI4TS
AI4CE
50
10
0
01 Feb 2023
Cross-Architectural Positive Pairs improve the effectiveness of Self-Supervised Learning
P. Singh
Jacopo Cirrone
SSL
40
0
0
27 Jan 2023
Exploring the Effect of Multi-step Ascent in Sharpness-Aware Minimization
Hoki Kim
Jinseong Park
Yujin Choi
Woojin Lee
Jaewook Lee
20
9
0
27 Jan 2023
Model soups to increase inference without increasing compute time
Charles Dansereau
Milo Sobral
Maninder Bhogal
Mehdi Zalai
16
2
0
24 Jan 2023
Stability Analysis of Sharpness-Aware Minimization
Hoki Kim
Jinseong Park
Yujin Choi
Jaewook Lee
39
12
0
16 Jan 2023
Training trajectories, mini-batch losses and the curious role of the learning rate
Mark Sandler
A. Zhmoginov
Max Vladymyrov
Nolan Miller
ODL
28
10
0
05 Jan 2023
Do Bayesian Variational Autoencoders Know What They Don't Know?
Misha Glazunov
Apostolis Zarras
UQCV
BDL
28
5
0
29 Dec 2022
Training Integer-Only Deep Recurrent Neural Networks
V. Nia
Eyyub Sari
Vanessa Courville
M. Asgharian
MQ
53
2
0
22 Dec 2022
KL Regularized Normalization Framework for Low Resource Tasks
Neeraj Kumar
Ankur Narang
Brejesh Lall
26
1
0
21 Dec 2022
Dataless Knowledge Fusion by Merging Weights of Language Models
Xisen Jin
Xiang Ren
Daniel Preotiuc-Pietro
Pengxiang Cheng
FedML
MoMe
24
214
0
19 Dec 2022
A Probabilistic Framework for Lifelong Test-Time Adaptation
Dhanajit Brahma
Piyush Rai
TTA
24
34
0
19 Dec 2022
The Underlying Correlated Dynamics in Neural Training
Rotem Turjeman
Tom Berkov
I. Cohen
Guy Gilboa
27
3
0
18 Dec 2022
Bayesian posterior approximation with stochastic ensembles
Oleksandr Balabanov
Bernhard Mehlig
H. Linander
BDL
UQCV
27
5
0
15 Dec 2022
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
Ashwinee Panda
Xinyu Tang
Saeed Mahloujifar
Vikash Sehwag
Prateek Mittal
43
11
0
08 Dec 2022
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
72
435
0
08 Dec 2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
28
52
0
02 Dec 2022
BARTSmiles: Generative Masked Language Models for Molecular Representations
Gayane Chilingaryan
Hovhannes Tamoyan
Ani Tevosyan
N. Babayan
L. Khondkaryan
Karen Hambardzumyan
Zaven Navoyan
Hrant Khachatrian
Armen Aghajanyan
SSL
35
25
0
29 Nov 2022
Cross-Domain Ensemble Distillation for Domain Generalization
Kyung-Jin Lee
Sungyeon Kim
Suha Kwak
FedML
OOD
26
38
0
25 Nov 2022
Indian Commercial Truck License Plate Detection and Recognition for Weighbridge Automation
Siddharth Agrawal
Keyur D. Joshi
35
4
0
23 Nov 2022
Previous
1
2
3
4
5
6
7
8
Next