1803.05407
Averaging Weights Leads to Wider Optima and Better Generalization
14 March 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML
MoMe
Papers citing
"Averaging Weights Leads to Wider Optima and Better Generalization"
50 / 362 papers shown
Generalization Measures for Zero-Shot Cross-Lingual Transfer
Saksham Bassi
Duygu Ataman
Kyunghyun Cho
29
0
0
24 Apr 2024
DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models
Nastaran Saadati
Minh Pham
Nasla Saleem
Joshua R. Waite
Aditya Balu
Zhanhong Jiang
Chinmay Hegde
Soumik Sarkar
MoMe
47
1
0
11 Apr 2024
Flatness Improves Backbone Generalisation in Few-shot Classification
Rui Li
Martin Trapp
Marcus Klasson
Arno Solin
45
0
0
11 Apr 2024
Investigation of Energy-efficient AI Model Architectures and Compression Techniques for "Green" Fetal Brain Segmentation
Szymon Mazurek
M. Pytlarz
Sylwia Malec
A. Crimi
33
0
0
03 Apr 2024
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
En-hao Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Shuaiqi Wang
...
Sergey Yekhanin
Guohao Dai
Huazhong Yang
Yu-Xiang Wang
Yu Wang
MoMe
57
4
0
02 Apr 2024
Embracing Unknown Step by Step: Towards Reliable Sparse Training in Real World
Bowen Lei
Dongkuan Xu
Ruqi Zhang
Bani Mallick
UQCV
39
0
0
29 Mar 2024
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard
Shamane Siriwardhana
Malikeh Ehghaghi
Luke Meyers
Vladimir Karpukhin
Brian Benedict
Mark McQuade
Jacob Solawetz
MoMe
KELM
90
80
0
20 Mar 2024
HeteroSwitch: Characterizing and Taming System-Induced Data Heterogeneity in Federated Learning
Gyudong Kim
Mehdi Ghasemi
Soroush Heidari
Seungryong Kim
Young Geun Kim
S. Vrudhula
Carole-Jean Wu
34
1
0
07 Mar 2024
Revisiting Confidence Estimation: Towards Reliable Failure Prediction
Fei Zhu
Xu-Yao Zhang
Zhen Cheng
Cheng-Lin Liu
UQCV
52
10
0
05 Mar 2024
Bayesian Uncertainty Estimation by Hamiltonian Monte Carlo: Applications to Cardiac MRI Segmentation
Yidong Zhao
João Tourais
Iain Pierce
Christian Nitsche
T. Treibel
Sebastian Weingartner
Artur M. Schweidtmann
Qian Tao
BDL
UQCV
43
5
0
04 Mar 2024
Fine-tuning with Very Large Dropout
Jianyu Zhang
Léon Bottou
46
1
0
01 Mar 2024
Adversarial Example Soups: Improving Transferability and Stealthiness for Free
Bo Yang
Hengwei Zhang
Jin-dong Wang
Yulong Yang
Chenhao Lin
Chao Shen
Zhengyu Zhao
SILM
AAML
71
2
0
27 Feb 2024
Effective Gradient Sample Size via Variation Estimation for Accelerating Sharpness aware Minimization
Jiaxin Deng
Junbiao Pang
Baochang Zhang
Tian Wang
48
1
0
24 Feb 2024
RADIN: Souping on a Budget
Thibaut Menes
Olivier Risser-Maroix
MoMe
49
1
0
31 Jan 2024
Learning under Label Noise through Few-Shot Human-in-the-Loop Refinement
Aaqib Saeed
Dimitris Spathis
Jungwoo Oh
Edward Choi
Ali Etemad
NoLa
31
2
0
25 Jan 2024
Doubly Perturbed Task Free Continual Learning
Byung Hyun Lee
Min-hwan Oh
Se Young Chun
27
3
0
20 Dec 2023
IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition
Xiaomeng Yang
Zhi Qiao
Yu Zhou
DiffM
62
1
0
19 Dec 2023
Open Domain Generalization with a Single Network by Regularization Exploiting Pre-trained Features
Inseop Chung
Kiyoon Yoo
Nojun Kwak
VLM
16
0
0
08 Dec 2023
Analyzing and Improving the Training Dynamics of Diffusion Models
Tero Karras
M. Aittala
J. Lehtinen
Janne Hellsten
Timo Aila
S. Laine
42
157
0
05 Dec 2023
Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction
Cheng Sun
Wei-En Tai
Yu-Lin Shih
Kuan-Wei Chen
Yong-Jing Syu
Kent Selwyn The
Yu-Chiang Frank Wang
Hwann-Tzong Chen
3DV
38
2
0
30 Nov 2023
Efficient Stitchable Task Adaptation
Haoyu He
Zizheng Pan
Jing Liu
Jianfei Cai
Bohan Zhuang
34
3
0
29 Nov 2023
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
44
1
0
29 Nov 2023
Parameter Exchange for Robust Dynamic Domain Generalization
Luojun Lin
Zhifeng Shen
Zhishu Sun
Yuanlong Yu
Lei Zhang
Weijie Chen
OOD
30
6
0
23 Nov 2023
Language and Task Arithmetic with Parameter-Efficient Layers for Zero-Shot Summarization
Alexandra Chronopoulou
Jonas Pfeiffer
Joshua Maynez
Xinyi Wang
Sebastian Ruder
Priyanka Agrawal
MoMe
26
16
0
15 Nov 2023
Balance, Imbalance, and Rebalance: Understanding Robust Overfitting from a Minimax Game Perspective
Yifei Wang
Liangchen Li
Jiansheng Yang
Zhouchen Lin
Yisen Wang
31
11
0
30 Oct 2023
Model Merging by Uncertainty-Based Gradient Matching
Nico Daheim
Thomas Möllenhoff
E. Ponti
Iryna Gurevych
Mohammad Emtiyaz Khan
MoMe
FedML
32
44
0
19 Oct 2023
Causal Dynamic Variational Autoencoder for Counterfactual Regression in Longitudinal Data
Mouad El Bouchattaoui
Myriam Tami
Benoit Lepetit
P. Cournède
CML
OOD
71
2
0
16 Oct 2023
On the Over-Memorization During Natural, Robust and Catastrophic Overfitting
Runqi Lin
Chaojian Yu
Bo Han
Tongliang Liu
35
7
0
13 Oct 2023
Chunking: Continual Learning is not just about Distribution Shift
Thomas L. Lee
Amos Storkey
25
1
0
03 Oct 2023
Weight Averaging Improves Knowledge Distillation under Domain Shift
Valeriy Berezovskiy
Nikita Morozov
MoMe
31
1
0
20 Sep 2023
Uncertainty Estimation of Transformers' Predictions via Topological Analysis of the Attention Matrices
Elizaveta Kostenok
D. Cherniavskii
Alexey Zaytsev
56
5
0
22 Aug 2023
Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
MQ
41
16
0
21 Aug 2023
Benchmarking Scalable Epistemic Uncertainty Quantification in Organ Segmentation
Jadie Adams
Shireen Y. Elhabian
UQCV
21
5
0
15 Aug 2023
Channel-Wise Contrastive Learning for Learning with Noisy Labels
Hui-Sung Kang
Sheng Liu
Huaxi Huang
Tongliang Liu
NoLa
42
0
0
14 Aug 2023
Lookbehind-SAM: k steps back, 1 step forward
Gonçalo Mordido
Pranshu Malviya
A. Baratin
Sarath Chandar
AAML
45
1
0
31 Jul 2023
Cross-dimensional transfer learning in medical image segmentation with deep learning
Hicham Messaoudi
Ahror Belaid
Douraied BEN SALEM
Pierre-Henri Conze
MedIm
30
23
0
29 Jul 2023
FedSoup: Improving Generalization and Personalization in Federated Learning via Selective Model Interpolation
Minghui Chen
Meirui Jiang
Qianming Dou
Zehua Wang
Xiaoxiao Li
FedML
35
16
0
20 Jul 2023
Towards Building More Robust Models with Frequency Bias
Qingwen Bu
Dong Huang
Heming Cui
AAML
17
10
0
19 Jul 2023
Layer-wise Linear Mode Connectivity
Linara Adilova
Maksym Andriushchenko
Michael Kamp
Asja Fischer
Martin Jaggi
FedML
FAtt
MoMe
33
15
0
13 Jul 2023
Concurrent ischemic lesion age estimation and segmentation of CT brain using a Transformer-based network
A. Marcus
P. Bentley
Daniel Rueckert
MedIm
21
9
0
21 Jun 2023
Confidence-Based Model Selection: When to Take Shortcuts for Subpopulation Shifts
Annie S. Chen
Yoonho Lee
Amrith Rajagopal Setlur
Sergey Levine
Chelsea Finn
OOD
19
5
0
19 Jun 2023
A Boosted Model Ensembling Approach to Ball Action Spotting in Videos: The Runner-Up Solution to CVPR'23 SoccerNet Challenge
Luping Wang
Hao Guo
B. Liu
35
3
0
09 Jun 2023
Quantifying Representation Reliability in Self-Supervised Learning Models
Young-Jin Park
Hao Wang
Shervin Ardeshir
Navid Azizan
SSL
UQCV
34
3
0
31 May 2023
VIPriors 3: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
Robert-Jan Bruintjes
A. Lengyel
Marcos Baptista-Rios
O. Kayhan
Davide Zambrano
Nergis Tomen
Jan van Gemert
25
9
0
31 May 2023
HyperTime: Hyperparameter Optimization for Combating Temporal Distribution Shifts
Shaokun Zhang
Yiran Wu
Zhonghua Zheng
Qingyun Wu
Chi Wang
OOD
51
7
0
28 May 2023
How to escape sharp minima with random perturbations
Kwangjun Ahn
Ali Jadbabaie
S. Sra
ODL
34
6
0
25 May 2023
Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning
Moonseok Choi
Hyungi Lee
G. Nam
Juho Lee
37
2
0
24 May 2023
Improving Convergence and Generalization Using Parameter Symmetries
Bo Zhao
Robert Mansel Gower
Robin Walters
Rose Yu
MoMe
33
13
0
22 May 2023
POEM: Polarization of Embeddings for Domain-Invariant Representations
Sang-Yeong Jo
Sung Whan Yoon
19
8
0
22 May 2023
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models
Guillermo Ortiz-Jiménez
Alessandro Favero
P. Frossard
MoMe
51
110
0
22 May 2023