ResearchTrend.AI
Averaging Weights Leads to Wider Optima and Better Generalization
arXiv:1803.05407 (v3, latest)

14 March 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML MoMe

Papers citing "Averaging Weights Leads to Wider Optima and Better Generalization"

50 / 1,040 papers shown
A Closer Look at Smoothness in Domain Adversarial Training
Harsh Rangwani
Sumukh K Aithal
Mayank Mishra
Arihant Jain
R. Venkatesh Babu
118
122
0
16 Jun 2022
Federated Learning with Uncertainty via Distilled Predictive Distributions
Shreyansh P. Bhatt
Aishwarya Gupta
Piyush Rai
FedML
63
11
0
15 Jun 2022
Bayesian Learning of Parameterised Quantum Circuits
Samuel Duffield
Marcello Benedetti
Matthias Rosenkranz
60
11
0
15 Jun 2022
Label Matching Semi-Supervised Object Detection
Binbin Chen
Weijie Chen
Shicai Yang
Yunyi Xuan
Jie Song
Di Xie
Shiliang Pu
Mingli Song
Yueting Zhuang
92
76
0
14 Jun 2022
Geometrically Guided Integrated Gradients
Md. Mahfuzur Rahman
N. Lewis
Sergey Plis
FAtt AAML
30
0
0
13 Jun 2022
Density Regression and Uncertainty Quantification with Bayesian Deep Noise Neural Networks
Daiwei Zhang
Tianci Liu
Jian Kang
BDL UQCV
71
3
0
12 Jun 2022
Fisher SAM: Information Geometry and Sharpness Aware Minimisation
Minyoung Kim
Da Li
S. Hu
Timothy M. Hospedales
AAML
92
72
0
10 Jun 2022
CASS: Cross Architectural Self-Supervision for Medical Image Analysis
Pranav Singh
E. Sizikova
Jacopo Cirrone
OOD
173
8
0
08 Jun 2022
Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning
Momin Abbas
Quan-Wu Xiao
Lisha Chen
Pin-Yu Chen
Tianyi Chen
111
84
0
08 Jun 2022
Improving Adversarial Robustness by Putting More Regularizations on Less Robust Samples
Dongyoon Yang
Insung Kong
Yongdai Kim
OOD AAML
80
10
0
07 Jun 2022
FlexLip: A Controllable Text-to-Lip System
Dan Oneaţă
Beáta Lőrincz
Adriana Stan
H. Cucu
55
3
0
07 Jun 2022
DiMS: Distilling Multiple Steps of Iterative Non-Autoregressive Transformers for Machine Translation
Sajad Norouzi
Rasa Hosseinzadeh
Felipe Pérez
M. Volkovs
44
2
0
07 Jun 2022
Learning Dynamics and Generalization in Reinforcement Learning
Clare Lyle
Mark Rowland
Will Dabney
Marta Z. Kwiatkowska
Y. Gal
OOD OffRL
77
13
0
05 Jun 2022
Beyond accuracy: generalization properties of bio-plausible temporal credit assignment rules
Yuhan Helena Liu
Arna Ghosh
Blake A. Richards
E. Shea-Brown
Guillaume Lajoie
94
10
0
02 Jun 2022
Differentiable programming for functional connectomics
R. Ciric
A. Thomas
Oscar Esteban
R. Poldrack
64
0
0
31 May 2022
Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training
Lu Yin
Vlado Menkovski
Meng Fang
Tianjin Huang
Yulong Pei
Mykola Pechenizkiy
Decebal Constantin Mocanu
Shiwei Liu
110
8
0
30 May 2022
Failure Detection in Medical Image Classification: A Reality Check and Benchmarking Testbed
Mélanie Bernhardt
Fabio De Sousa Ribeiro
Ben Glocker
56
10
0
27 May 2022
Trainable Weight Averaging: Accelerating Training and Improving Generalization
Tao Li
Zhehao Huang
Yingwen Wu
Zhengbao He
Qinghua Tao
Xiaolin Huang
Chih-Jen Lin
MoMe
108
3
0
26 May 2022
Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Clara Na
Sanket Vaibhav Mehta
Emma Strubell
114
20
0
25 May 2022
Alleviating Robust Overfitting of Adversarial Training With Consistency Regularization
Shudong Zhang
Haichang Gao
Tianwei Zhang
Yunyi Zhou
Zihui Wu
AAML
74
4
0
24 May 2022
Training Efficient CNNS: Tweaking the Nuts and Bolts of Neural Networks for Lighter, Faster and Robust Models
Sabeesh Ethiraj
B. Bolla
97
2
0
23 May 2022
Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors
Ravid Shwartz-Ziv
Micah Goldblum
Hossein Souri
Sanyam Kapoor
Chen Zhu
Yann LeCun
A. Wilson
UQCV BDL
128
43
0
20 May 2022
Diverse Weight Averaging for Out-of-Distribution Generalization
Alexandre Ramé
Matthieu Kirchmeyer
Thibaud Rahier
A. Rakotomamonjy
Patrick Gallinari
Matthieu Cord
OOD
258
138
0
19 May 2022
Pretraining Approaches for Spoken Language Recognition: TalTech Submission to the OLR 2021 Challenge
Tanel Alumäe
Kunnar Kukk
44
6
0
14 May 2022
NeuralEF: Deconstructing Kernels by Deep Neural Networks
Zhijie Deng
Jiaxin Shi
Jun Zhu
134
19
0
30 Apr 2022
Reducing Predictive Feature Suppression in Resource-Constrained Contrastive Image-Caption Retrieval
Maurits J. R. Bleeker
Andrew Yates
Maarten de Rijke
91
4
0
28 Apr 2022
Conformer and Blind Noisy Students for Improved Image Quality Assessment
Marcos V. Conde
Maxime Burchi
Radu Timofte
DiffM
91
14
0
27 Apr 2022
Meta-free few-shot learning via representation learning with weight averaging
Kuilin Chen
Chi-Guhn Lee
56
5
0
26 Apr 2022
Federated Geometric Monte Carlo Clustering to Counter Non-IID Datasets
Federico Lucchetti
Jérémie Decouchant
Maria Fernandes
L. Chen
Marcus Volp
FedML
53
0
0
23 Apr 2022
Towards a Deeper Understanding of Skeleton-based Gait Recognition
Torben Teepe
Johannes Gilg
Fabian Herzog
S. Hörmann
Gerhard Rigoll
CVBM
73
74
0
16 Apr 2022
A Simple Approach to Adversarial Robustness in Few-shot Image Classification
Akshayvarun Subramanya
Hamed Pirsiavash
VLM
66
6
0
11 Apr 2022
The Two Dimensions of Worst-case Training and the Integrated Effect for Out-of-domain Generalization
Zeyi Huang
Haohan Wang
Dong Huang
Yong Jae Lee
Eric P. Xing
88
22
0
09 Apr 2022
The Sillwood Technologies System for the VoiceMOS Challenge 2022
Jiameng Gao
54
0
0
08 Apr 2022
Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results
T. Ridnik
Hussam Lawen
Emanuel Ben-Baruch
Asaf Noy
104
11
0
07 Apr 2022
Event Transformer. A sparse-aware solution for efficient event data processing
Alberto Sabater
Luis Montesano
Ana C. Murillo
96
52
0
07 Apr 2022
FedCos: A Scene-adaptive Federated Optimization Enhancement for Performance Improvement
Hao Zhang
Tingting Wu
Siyao Cheng
Jie Liu
FedML
57
12
0
07 Apr 2022
Differentially Private Sampling from Rashomon Sets, and the Universality of Langevin Diffusion for Convex Optimization
Arun Ganesh
Abhradeep Thakurta
Jalaj Upadhyay
89
1
0
04 Apr 2022
Omni-DETR: Omni-Supervised Object Detection with Transformers
Pei Wang
Zhaowei Cai
Hao Yang
Gurumurthy Swaminathan
Nuno Vasconcelos
Bernt Schiele
Stefano Soatto
97
41
0
30 Mar 2022
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification
Juncheng Billy Li
Shuhui Qu
Po-Yao (Bernie) Huang
Florian Metze
VLM
100
9
0
25 Mar 2022
Improving Generalization in Federated Learning by Seeking Flat Minima
Debora Caldarola
Barbara Caputo
Marco Ciccone
FedML
101
112
0
22 Mar 2022
Delving into the Estimation Shift of Batch Normalization in a Network
Lei Huang
Yi Zhou
Tian Wang
Jie Luo
Xianglong Liu
BDL
138
20
0
21 Mar 2022
Closing the Generalization Gap of Cross-silo Federated Medical Image Segmentation
An Xu
Wenqi Li
Pengfei Guo
Dong Yang
H. Roth
Ali Hatamizadeh
Can Zhao
Daguang Xu
Heng-Chiao Huang
Ziyue Xu
FedML
86
53
0
18 Mar 2022
Surrogate Gap Minimization Improves Sharpness-Aware Training
Juntang Zhuang
Boqing Gong
Liangzhe Yuan
Huayu Chen
Hartwig Adam
Nicha Dvornek
S. Tatikonda
James Duncan
Ting Liu
105
157
0
15 Mar 2022
On the benefits of knowledge distillation for adversarial robustness
Javier Maroto
Guillermo Ortiz-Jiménez
P. Frossard
AAML FedML
74
20
0
14 Mar 2022
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong
Sameer Khurana
Andrew Rouditchenko
James R. Glass
VLM
73
29
0
13 Mar 2022
Detection of multiple retinal diseases in ultra-widefield fundus images using deep learning: data-driven identification of relevant regions
Justin Engelmann
Alice D. McTrusty
Ian J. C. MacCormick
Emma Pead
Amos Storkey
Miguel O. Bernabeu
MedIm
44
1
0
11 Mar 2022
Flexible Amortized Variational Inference in qBOLD MRI
Ivor J. A. Simpson
Ashley McManamon
Balázs Örzsik
A. Stone
N. Blockley
Iris Asllani
A. Colasanti
M. Cercignani
26
0
0
11 Mar 2022
QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
Xiuying Wei
Ruihao Gong
Yuhang Li
Xianglong Liu
F. Yu
MQ VLM
100
178
0
11 Mar 2022
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
...
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
MoMe
201
1,013
1
10 Mar 2022
Evolutionary Neural Cascade Search across Supernetworks
A. Chebykin
Tanja Alderliesten
Peter A. N. Bosman
48
1
0
08 Mar 2022