Averaging Weights Leads to Wider Optima and Better Generalization

14 March 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
    FedML MoMe

Papers citing "Averaging Weights Leads to Wider Optima and Better Generalization"

50 / 1,040 papers shown
Boosting Adversarial Transferability for Skeleton-based Action Recognition via Exploring the Model Posterior Space
Yunfeng Diao
Baiqi Wu
Ruixuan Zhang
Xun Yang
Meng Wang
He Wang
85
0
0
11 Jul 2024
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
Zhenyu Zhang
Ajay Jaiswal
L. Yin
Shiwei Liu
Jiawei Zhao
Yuandong Tian
Zhangyang Wang
VLM
75
23
0
11 Jul 2024
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences
Nikolaos Dimitriadis
Pascal Frossard
François Fleuret
MoE
253
8
0
10 Jul 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
116
27
0
08 Jul 2024
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis
Stefan Horoi
Albert Manuel Orozco Camacho
Eugene Belilovsky
Guy Wolf
FedML MoMe
56
10
0
07 Jul 2024
On the power of data augmentation for head pose estimation
Michael Welter
CVBM
65
1
0
07 Jul 2024
Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation
Tiia Sildam
Andra Velve
Tanel Alumäe
102
0
0
04 Jul 2024
Learning Scalable Model Soup on a Single GPU: An Efficient Subspace Training Strategy
Tao Li
Weisen Jiang
Fanghui Liu
Xiaolin Huang
James T. Kwok
MoMe
94
2
0
04 Jul 2024
Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks
Amit Peleg
Matthias Hein
74
0
0
04 Jul 2024
Face Reconstruction Transfer Attack as Out-of-Distribution Generalization
Yoon Gyo Jung
Jaewoo Park
Xingbo Dong
Hojin Park
Andrew Beng Jin Teoh
Octavia Camps
AAML
100
0
0
02 Jul 2024
DADEE: Well-calibrated uncertainty quantification in neural networks for barriers-based robot safety
Masoud Ataei
Vikas Dhiman
57
0
0
30 Jun 2024
Enhancing Accuracy and Parameter-Efficiency of Neural Representations for Network Parameterization
Hongjun Choi
Jayaraman J. Thiagarajan
Ruben Glatt
Shusen Liu
87
0
0
29 Jun 2024
Adaptive Stochastic Weight Averaging
Caglar Demir
Arnab Sharma
Axel-Cyrille Ngonga Ngomo
MoMe
67
1
0
27 Jun 2024
VIPriors 4: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
Robert-Jan Bruintjes
A. Lengyel
Marcos Baptista-Rios
O. Kayhan
Davide Zambrano
Nergis Tomen
Jan van Gemert
VLM
96
0
0
26 Jun 2024
WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
134
23
0
24 Jun 2024
One-Class Learning with Adaptive Centroid Shift for Audio Deepfake Detection
Hyun Myung Kim
Kangwook Jang
Hoirin Kim
74
7
0
24 Jun 2024
Multimodal Multilabel Classification by CLIP
Yanming Guo
VLM
40
0
0
23 Jun 2024
PathoWAve: A Deep Learning-based Weight Averaging Method for Improving Domain Generalization in Histopathology Images
Parastoo Sotoudeh Sharifi
M. Omair Ahmad
M. N. S. Swamy
MoMe OOD
96
0
0
21 Jun 2024
DataFreeShield: Defending Adversarial Attacks without Training Data
Hyeyoon Lee
Kanghyun Choi
Dain Kwon
Sunjong Park
Mayoore S. Jaiswal
Noseong Park
Jonghyun Choi
Jinho Lee
78
0
0
21 Jun 2024
DEM: Distribution Edited Model for Training with Mixed Data Distributions
Dhananjay Ram
Aditya Rawal
Momchil Hardalov
Nikolaos Pappas
Sheng Zha
MoMe
138
2
0
21 Jun 2024
Bayesian neural networks for predicting uncertainty in full-field material response
G. Pasparakis
Lori Graham-Brady
Michael D. Shields
AI4CE
86
4
0
21 Jun 2024
Flat Posterior Does Matter For Bayesian Model Averaging
Sungjun Lim
Jeyoon Yeom
Sooyon Kim
Hoyoon Byun
Jinho Kang
Yohan Jung
Jiyoung Jung
Kyungwoo Song
BDL AAML
114
0
0
21 Jun 2024
MEAT: Median-Ensemble Adversarial Training for Improving Robustness and Generalization
Zhaozhe Hu
Jia-Li Yin
Bin Chen
Luojun Lin
Bo-Hao Chen
Ximeng Liu
AAML
122
0
0
20 Jun 2024
WATT: Weight Average Test-Time Adaptation of CLIP
David Osowiechi
Mehrdad Noori
G. A. V. Hakim
Moslem Yazdanpanah
Ali Bahri
Milad Cheraghalikhani
Sahar Dastani
Farzad Beizaee
Ismail Ben Ayed
Christian Desrosiers
VLM
77
7
0
19 Jun 2024
Large-Scale Dataset Pruning in Adversarial Training through Data Importance Extrapolation
Bjorn Nieth
Thomas Altstidl
Leo Schwinn
Björn Eskofier
AAML
109
3
0
19 Jun 2024
Efficient Sharpness-Aware Minimization for Molecular Graph Transformer Models
Yili Wang
Kaixiong Zhou
Ninghao Liu
Ying Wang
Xin Wang
67
10
0
19 Jun 2024
Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation
Branislav Pecher
Ján Cegin
Róbert Belanec
Jakub Simko
Ivan Srba
Maria Bielikova
83
1
0
18 Jun 2024
MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic
Yuyan Zhou
Liang Song
Bingning Wang
Weipeng Chen
MoMe
102
23
0
17 Jun 2024
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
Benjamin Biggs
Arjun Seshadri
Yang Zou
Achin Jain
Aditya Golatkar
Yusheng Xie
Alessandro Achille
Ashwin Swaminathan
Stefano Soatto
MoMe DiffM
92
13
0
12 Jun 2024
Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization
Jiaxin Deng
Junbiao Pang
Baochang Zhang
131
1
0
12 Jun 2024
Autoregressive Pretraining with Mamba in Vision
Sucheng Ren
Xianhang Li
Haoqin Tu
Feng Wang
Fangxun Shu
...
L. Yang
Peng Wang
Heng Wang
Alan Yuille
Cihang Xie
Mamba
125
12
0
11 Jun 2024
Merging Improves Self-Critique Against Jailbreak Attacks
Victor Gallego
AAML MoMe
93
4
0
11 Jun 2024
Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation
Dong Zhao
Shuang Wang
Qi Zang
Licheng Jiao
N. Sebe
Zhun Zhong
119
2
0
10 Jun 2024
MeGA: Merging Multiple Independently Trained Neural Networks Based on Genetic Algorithm
Daniel Yun
FedML MoMe
46
1
0
07 Jun 2024
CTBENCH: A Library and Benchmark for Certified Training
Yuhao Mao
Stefan Balauca
Martin Vechev
OOD
133
5
0
07 Jun 2024
A Universal Class of Sharpness-Aware Minimization Algorithms
B. Tahmasebi
Ashkan Soleymani
Dara Bahri
Stefanie Jegelka
Patrick Jaillet
AAML
81
3
0
06 Jun 2024
Measuring Stochastic Data Complexity with Boltzmann Influence Functions
Nathan Ng
Roger C. Grosse
Marzyeh Ghassemi
68
0
0
04 Jun 2024
On the Limitations of Fractal Dimension as a Measure of Generalization
Charlie Tan
Inés García-Redondo
Qiquan Wang
M. Bronstein
Anthea Monod
AI4CE
58
0
0
04 Jun 2024
Estimating Canopy Height at Scale
Jan Pauls
Max Zimmer
Una M. Kelly
Martin Schwartz
Sassan Saatchi
P. Ciais
Sebastian Pokutta
Martin Brandt
Fabian Gieseke
104
11
0
03 Jun 2024
On the Use of Anchoring for Training Vision Models
V. Narayanaswamy
Kowshik Thopalli
Rushil Anirudh
Yamen Mubarka
W. Sakla
Jayaraman J. Thiagarajan
95
0
0
01 Jun 2024
Scalable Bayesian Learning with posteriors
Samuel Duffield
Kaelan Donatella
Johnathan Chiu
Phoebe Klett
Daniel Simpson
BDL UQCV
178
2
0
31 May 2024
Weights Augmentation: it has never ever ever ever let her model down
Junbin Zhuang
Guiguang Din
Yunyi Yan
97
1
0
30 May 2024
Kernel Semi-Implicit Variational Inference
Ziheng Cheng
Longlin Yu
Tianyu Xie
Shiyue Zhang
Cheng Zhang
57
7
0
29 May 2024
Domain-Inspired Sharpness-Aware Minimization Under Domain Shifts
Ruipeng Zhang
Ziqing Fan
Jiangchao Yao
Ya Zhang
Yanfeng Wang
76
7
0
29 May 2024
Zero-to-Hero: Enhancing Zero-Shot Novel View Synthesis via Attention Map Filtering
Ido Sobol
Chenfeng Xu
Or Litany
DiffM
80
2
0
29 May 2024
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Alexander Hägele
Elie Bakouch
Atli Kosson
Loubna Ben Allal
Leandro von Werra
Martin Jaggi
125
45
0
28 May 2024
Towards Unified Robustness Against Both Backdoor and Adversarial Attacks
Zhenxing Niu
Yuyao Sun
Qiguang Miao
Rong Jin
Gang Hua
AAML
70
7
0
28 May 2024
WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average
Louis Fournier
Adel Nabli
Masih Aminbeidokhti
M. Pedersoli
Eugene Belilovsky
Edouard Oyallon
MoMe FedML
95
3
0
27 May 2024
The Road Less Scheduled
Aaron Defazio
Xingyu Yang
Harsh Mehta
Konstantin Mishchenko
Ahmed Khaled
Ashok Cutkosky
120
60
0
24 May 2024
Exploring and Exploiting the Asymmetric Valley of Deep Neural Networks
Xin-Chun Li
Jinli Tang
Bo Zhang
Lan Li
De-Chuan Zhan
85
2
0
21 May 2024