ResearchTrend.AI
Averaging Weights Leads to Wider Optima and Better Generalization

14 March 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML, MoMe
arXiv: 1803.05407

Papers citing "Averaging Weights Leads to Wider Optima and Better Generalization"

50 / 1,040 papers shown
Zero- and Few-shot Sound Event Localization and Detection
Kazuki Shimada
Kengo Uchida
Yuichiro Koyama
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
Tatsuya Kawahara
79
4
0
17 Sep 2023
Improve Deep Forest with Learnable Layerwise Augmentation Policy Schedule
Hongyu Zhu
Sichu Liang
Wentao Hu
Fangqi Li
Yali Yuan
Shi-Lin Wang
Guang Cheng
46
2
0
16 Sep 2023
A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale
Hao-Jun Michael Shi
Tsung-Hsien Lee
Shintaro Iwasaki
Jose Gallego-Posada
Zhijing Li
Kaushik Rangadurai
Dheevatsa Mudigere
Michael Rabbat
ODL
98
27
0
12 Sep 2023
Exploring Flat Minima for Domain Generalization with Large Learning Rates
Jian Zhang
Lei Qi
Yinghuan Shi
Yang Gao
81
3
0
12 Sep 2023
Hazards in Deep Learning Testing: Prevalence, Impact and Recommendations
Salah Ghamizi
Maxime Cordy
Yuejun Guo
Mike Papadakis
Yves Le Traon
35
1
0
11 Sep 2023
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
Etienne Labbé
Thomas Pellegrini
J. Pinquier
81
14
0
01 Sep 2023
Neural Network Training Strategy to Enhance Anomaly Detection Performance: A Perspective on Reconstruction Loss Amplification
Yeonghyeon Park
Sungho Kang
Myung Jin Kim
Hyeonho Jeong
H. Park
Hyeong Seok Kim
Juneho Yi
69
4
0
28 Aug 2023
Do the Frankenstein, or how to achieve better out-of-distribution performance with manifold mixing model soup
Hannes Fassold
MoMe, UQCV
43
2
0
28 Aug 2023
A Re-Parameterized Vision Transformer (ReVT) for Domain-Generalized Semantic Segmentation
Jan-Aike Termöhlen
Tim Bartels
Tim Fingscheidt
ViT
65
6
0
25 Aug 2023
LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition
Changxu Cheng
Peng Wang
Cheng Da
Qi Zheng
Cong Yao
96
15
0
24 Aug 2023
SayCanPay: Heuristic Planning with Large Language Models using Learnable Domain Knowledge
Rishi Hazra
Pedro Zuidberg Dos Martires
Luc de Raedt
LM&Ro, LLMAG
87
38
0
24 Aug 2023
FedSOL: Stabilized Orthogonal Learning with Proximal Restrictions in Federated Learning
Gihun Lee
Minchan Jeong
Sangmook Kim
Jaehoon Oh
Se-Young Yun
FedML
87
9
0
24 Aug 2023
Stabilizing RNN Gradients through Pre-training
Luca Herranz-Celotti
Jean Rouat
107
1
0
23 Aug 2023
Revisiting and Exploring Efficient Fast Adversarial Training via LAW: Lipschitz Regularization and Auto Weight Averaging
Xiaojun Jia
YueFeng Chen
Xiaofeng Mao
Ranjie Duan
Jindong Gu
Rong Zhang
H. Xue
Xiaochun Cao
AAML
62
11
0
22 Aug 2023
Uncertainty Estimation of Transformers' Predictions via Topological Analysis of the Attention Matrices
Elizaveta Kostenok
D. Cherniavskii
Alexey Zaytsev
83
6
0
22 Aug 2023
Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
MQ
99
17
0
21 Aug 2023
Efficient Representation Learning for Healthcare with Cross-Architectural Self-Supervision
P. Singh
Jacopo Cirrone
OOD, SSL
85
2
0
19 Aug 2023
Benchmarking Scalable Epistemic Uncertainty Quantification in Organ Segmentation
Jadie Adams
Shireen Y. Elhabian
UQCV
67
6
0
15 Aug 2023
Channel-Wise Contrastive Learning for Learning with Noisy Labels
Hui-Sung Kang
Sheng Liu
Huaxi Huang
Tongliang Liu
NoLa
85
0
0
14 Aug 2023
Experts Weights Averaging: A New General Training Scheme for Vision Transformers
Yongqian Huang
Peng Ye
Xiaoshui Huang
Sheng Li
Tao Chen
Tong He
Wanli Ouyang
MoMe
84
9
0
11 Aug 2023
DiffCR: A Fast Conditional Diffusion Framework for Cloud Removal from Optical Satellite Images
Xuechao Zou
Keqin Li
Junliang Xing
Yu-an Zhang
Shiying Wang
Lei Jin
Pin Tao
DiffM
90
34
0
08 Aug 2023
Enhancing Adversarial Robustness in Low-Label Regime via Adaptively Weighted Regularization and Knowledge Distillation
Dongyoon Yang
Insung Kong
Yongdai Kim
74
4
0
08 Aug 2023
Make Explicit Calibration Implicit: Calibrate Denoiser Instead of the Noise Model
Xin Jin
Jianqiang Xiao
Linghao Han
Chunle Guo
Xialei Liu
Chongyi Li
Ruixun Zhang
103
3
0
07 Aug 2023
Frustratingly Easy Model Generalization by Dummy Risk Minimization
Juncheng Wang
Jindong Wang
Xixu Hu
Shujun Wang
Xingxu Xie
58
2
0
04 Aug 2023
Lookbehind-SAM: k steps back, 1 step forward
Gonçalo Mordido
Pranshu Malviya
A. Baratin
Sarath Chandar
AAML
90
1
0
31 Jul 2023
SR-R$^2$KAC: Improving Single Image Defocus Deblurring
Peng Tang
Zhiqiang Xu
Pengfei Wei
Xiaobin Hu
Peilin Zhao
Xin Cao
Chunlai Zhou
Tobias Lasser
SupR
48
0
0
30 Jul 2023
UnIVAL: Unified Model for Image, Video, Audio and Language Tasks
Mustafa Shukor
Corentin Dancette
Alexandre Ramé
Matthieu Cord
MoMe, MLLM
126
46
0
30 Jul 2023
Cross-dimensional transfer learning in medical image segmentation with deep learning
Hicham Messaoudi
Ahror Belaid
Douraied Ben Salem
Pierre-Henri Conze
MedIm
91
27
0
29 Jul 2023
Taxonomy Adaptive Cross-Domain Adaptation in Medical Imaging via Optimization Trajectory Distillation
Jianan Fan
Dongnan Liu
Hang Chang
Heng-Chiao Huang
Mei Chen
Weidong (Tom) Cai
OOD
91
9
0
27 Jul 2023
How to Scale Your EMA
Dan Busbridge
Jason Ramapuram
Pierre Ablin
Tatiana Likhomanenko
Eeshan Gunesh Dhekane
Xavier Suau
Russ Webb
82
19
0
25 Jul 2023
The instabilities of large learning rate training: a loss landscape view
Lawrence Wang
Stephen J. Roberts
27
2
0
22 Jul 2023
Improving Transferability of Adversarial Examples via Bayesian Attacks
Qizhang Li
Yiwen Guo
Xiaochen Yang
W. Zuo
Hao Chen
AAML, BDL
73
2
0
21 Jul 2023
FedSoup: Improving Generalization and Personalization in Federated Learning via Selective Model Interpolation
Minghui Chen
Meirui Jiang
Qianming Dou
Zehua Wang
Xiaoxiao Li
FedML
83
16
0
20 Jul 2023
Towards Building More Robust Models with Frequency Bias
Qingwen Bu
Dong Huang
Heming Cui
AAML
93
10
0
19 Jul 2023
Promoting Exploration in Memory-Augmented Adam using Critical Momenta
Pranshu Malviya
Gonçalo Mordido
A. Baratin
Reza Babanezhad Harikandeh
Jerry Huang
Simon Lacoste-Julien
Razvan Pascanu
Sarath Chandar
ODL
45
1
0
18 Jul 2023
FlexiAST: Flexibility is What AST Needs
Jiu Feng
Mehmet Hamza Erol
Joon Son Chung
Arda Senocak
55
3
0
18 Jul 2023
DOT: A Distillation-Oriented Trainer
Borui Zhao
Quan Cui
Renjie Song
Jiajun Liang
60
7
0
17 Jul 2023
Tangent Model Composition for Ensembling and Continual Fine-tuning
Tianlin Liu
Stefano Soatto
LRM, MoMe, CLL
84
17
0
16 Jul 2023
DISPEL: Domain Generalization via Domain-Specific Liberating
Chia-Yuan Chang
Yu-Neng Chuang
Guanchu Wang
Mengnan Du
Zou Na
54
6
0
14 Jul 2023
Layer-wise Linear Mode Connectivity
Linara Adilova
Maksym Andriushchenko
Michael Kamp
Asja Fischer
Martin Jaggi
FedML, FAtt, MoMe
115
17
0
13 Jul 2023
Unleashing the Potential of Regularization Strategies in Learning with Noisy Labels
Hui-Sung Kang
Sheng Liu
Huaxi Huang
Jun Yu
Bo Han
Dadong Wang
Tongliang Liu
NoLa
90
4
0
11 Jul 2023
On the curvature of the loss landscape
Alison Pouplin
Hrittik Roy
Sidak Pal Singh
Georgios Arvanitidis
54
1
0
10 Jul 2023
Differentiable Turbulence: Closure as a partial differential equation constrained optimization
Varun Shankar
Dibyajyoti Chakraborty
V. Viswanathan
R. Maulik
AI4CE
58
1
0
07 Jul 2023
FAM: Relative Flatness Aware Minimization
Linara Adilova
Amr Abourayya
Jianning Li
Amin Dada
Henning Petzka
Jan Egger
Jens Kleesiek
Michael Kamp
ODL
47
1
0
05 Jul 2023
Graph-Ensemble Learning Model for Multi-label Skin Lesion Classification using Dermoscopy and Clinical Images
Peng Tang
Yang Nan
Tobias Lasser
30
0
0
04 Jul 2023
Sparsity-aware generalization theory for deep neural networks
Ramchandran Muthukumar
Jeremias Sulam
MLT
56
7
0
01 Jul 2023
ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models
Uddeshya Upadhyay
Shyamgopal Karthik
Massimiliano Mancini
Zeynep Akata
MLLM, VLM
84
4
0
01 Jul 2023
Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging
Max Zimmer
Christoph Spiegel
Sebastian Pokutta
MoMe
125
14
0
29 Jun 2023
Low-Confidence Samples Mining for Semi-supervised Object Detection
Guandu Liu
Fan Zhang
Tianxiang Pan
Bin Wang
47
2
0
28 Jun 2023
Adaptive Sharpness-Aware Pruning for Robust Sparse Networks
Anna Bair
Hongxu Yin
Maying Shen
Pavlo Molchanov
J. Álvarez
106
12
0
25 Jun 2023