1803.05407
Averaging Weights Leads to Wider Optima and Better Generalization
14 March 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML
MoMe
Papers citing
"Averaging Weights Leads to Wider Optima and Better Generalization"
50 / 1,040 papers shown
Title
Deep Learning Meets Teleconnections: Improving S2S Predictions for European Winter Weather
P. Bommer
M. Kretschmer
Fiona R. Spuler
Kirill Bykov
Marina M.-C. Höhne
AI4Cl
51
1
0
10 Apr 2025
Multi-fidelity Reinforcement Learning Control for Complex Dynamical Systems
Luning Sun
Xin-Yang Liu
Siyan Zhao
Aditya Grover
Jian-Xun Wang
Jayaraman J. Thiagarajan
AI4CE
108
0
0
08 Apr 2025
Defending Deep Neural Networks against Backdoor Attacks via Module Switching
Weijun Li
Ansh Arora
Xuanli He
Mark Dras
Xingliang Yuan
AAML
MoMe
100
0
0
08 Apr 2025
Generative Classifier for Domain Generalization
Shaocong Long
Qianyu Zhou
Xuelong Li
Chenhao Ying
Yunhai Tong
Lizhuang Ma
Yuan Luo
Dacheng Tao
85
0
0
03 Apr 2025
Bayesian Pseudo Posterior Mechanism for Differentially Private Machine Learning
Robert Chew
Matthew R. Williams
Elan A. Segarra
Alexander J. Preiss
Amanda Konet
T. Savitsky
86
0
0
27 Mar 2025
Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos
Jiaheng Zhou
Yanfeng Zhou
Wei Fang
Yuxing Tang
Le Lu
Ge Yang
Mamba
515
0
0
26 Mar 2025
Balanced Direction from Multifarious Choices: Arithmetic Meta-Learning for Domain Generalization
Xiran Wang
Jian Zhang
Lei Qi
Yinghuan Shi
99
1
0
23 Mar 2025
On Local Posterior Structure in Deep Ensembles
Mikkel Jordahn
Jonas Vestergaard Jensen
Mikkel N. Schmidt
Michael Riis Andersen
UQCV
BDL
OOD
151
0
0
17 Mar 2025
FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization
Hao Mark Chen
S. Hu
Wayne Luk
Timothy M. Hospedales
Hongxiang Fan
MoMe
119
1
0
16 Mar 2025
Robust Dataset Distillation by Matching Adversarial Trajectories
Wei Lai
Tianyu Ding
Ren Dongdong
Lei Wang
Jing Huo
Yang Gao
Wenbin Li
AAML
DD
102
0
0
15 Mar 2025
Entropy-regularized Gradient Estimators for Approximate Bayesian Inference
Jasmeet Kaur
BDL
UQCV
124
0
0
15 Mar 2025
Understanding Flatness in Generative Models: Its Role and Benefits
Taehwan Lee
Kyeongkook Seo
Jaejun Yoo
Sung Whan Yoon
DiffM
97
0
0
14 Mar 2025
Enhanced Soups for Graph Neural Networks
Joseph Zuber
Aishwarya Sarkar
Joseph Jennings
Ali Jannesari
118
1
0
14 Mar 2025
Understanding the Logical Capabilities of Large Language Models via Out-of-Context Representation Learning
Jonathan Shaki
Emanuele La Malfa
Michael Wooldridge
Sarit Kraus
LRM
ReLM
141
0
0
13 Mar 2025
SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting
Linqi Yang
Xiongwei Zhao
Qihao Sun
Ke Wang
Ao Chen
Peng Kang
3DGS
138
6
0
07 Mar 2025
CrystalFramer: Rethinking the Role of Frames for SE(3)-Invariant Crystal Structure Modeling
Yusei Ito
Tatsunori Taniai
Ryo Igarashi
Yoshitaka Ushiku
K. Ono
106
0
0
04 Mar 2025
Deep Learning is Not So Mysterious or Different
Andrew Gordon Wilson
101
6
0
03 Mar 2025
Gradient-Guided Annealing for Domain Generalization
Aristotelis Ballas
Christos Diou
OOD
616
1
0
27 Feb 2025
HALO: Robust Out-of-Distribution Detection via Joint Optimisation
Hugo Lyons Keenan
S. Erfani
Christopher Leckie
OODD
273
0
0
27 Feb 2025
CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging
Zongzhen Yang
Binhang Qi
Hailong Sun
Wenrui Long
Ruobing Zhao
Xiang Gao
MoMe
120
0
0
26 Feb 2025
SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation
Dahun Shin
Dongyeop Lee
Jinseok Chung
Namhoon Lee
ODL
AAML
514
0
0
25 Feb 2025
Low-rank bias, weight decay, and model merging in neural networks
Ilja Kuzborskij
Yasin Abbasi-Yadkori
81
0
0
24 Feb 2025
Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation
Qiuming Zhao
Guangzhi Sun
Chao Zhang
Mingxing Xu
Thomas Fang Zheng
MoMe
VLM
457
1
0
24 Feb 2025
Robust Concept Erasure Using Task Vectors
Minh Pham
Kelly O. Marshall
Chinmay Hegde
Niv Cohen
190
20
0
21 Feb 2025
Scalable Model Merging with Progressive Layer-wise Distillation
Jing Xu
Jiazheng Li
J.N. Zhang
MoMe
FedML
327
2
0
18 Feb 2025
When, Where and Why to Average Weights?
Niccolò Ajroldi
Antonio Orvieto
Jonas Geiping
MoMe
287
0
0
10 Feb 2025
Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble
Atsushi Nitanda
Anzelle Lee
Damian Tan Xing Kai
Mizuki Sakaguchi
Taiji Suzuki
AI4CE
110
1
0
09 Feb 2025
LEAD: Large Foundation Model for EEG-Based Alzheimer's Disease Detection
Yihe Wang
Nan Huang
Nadia Mammone
Marco Cecchi
Xiang Zhang
148
1
0
02 Feb 2025
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Wenlin Yao
212
1
0
01 Feb 2025
Task Arithmetic in Trust Region: A Training-Free Model Merging Approach to Navigate Knowledge Conflicts
Wenju Sun
Qingyong Li
Wen Wang
Yangli-ao Geng
Boyang Li
192
5
0
28 Jan 2025
Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers
Xinyu Tang
Xiaolei Wang
Wayne Xin Zhao
Siyuan Lu
Yaliang Li
Ji-Rong Wen
LRM
131
18
0
28 Jan 2025
Hierarchical Light Transformer Ensembles for Multimodal Trajectory Forecasting
Adrien Lafage
Mathieu Barbier
Gianni Franchi
David Filliat
85
3
0
08 Jan 2025
Training-free Heterogeneous Model Merging
Zhengqi Xu
Han Zheng
Jie Song
Li Sun
Mingli Song
MoMe
236
1
0
03 Jan 2025
Test-Time Adaptation in Point Clouds: Leveraging Sampling Variation with Weight Averaging
Ali Bahri
Moslem Yazdanpanah
Mehrdad Noori
Sahar Dastani Oghani
Milad Cheraghalikhani
David Osowiech
Farzad Beizaee
G. A. V. Hakim
Ismail ben Ayed
Christian Desrosiers
3DPC
TTA
141
1
0
31 Dec 2024
Two Heads Are Better Than One: Averaging along Fine-Tuning to Improve Targeted Transferability
Hui Zeng
Sanshuai Cui
Biwei Chen
Anjie Peng
AAML
121
0
0
31 Dec 2024
Parameter-Efficient Interventions for Enhanced Model Merging
Marcin Osial
Daniel Marczak
Bartosz Zieliński
MoMe
150
1
0
22 Dec 2024
Non-Uniform Parameter-Wise Model Merging
Albert Manuel Orozco Camacho
Stefan Horoi
Guy Wolf
Eugene Belilovsky
MoMe
FedML
140
0
0
20 Dec 2024
Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes
Aodi Li
Liansheng Zhuang
Xiao Long
Minghong Yao
Shafei Wang
515
1
0
18 Dec 2024
MAL: Cluster-Masked and Multi-Task Pretraining for Enhanced xLSTM Vision Performance
Wenjun Huang
Jianguo Hu
123
0
0
14 Dec 2024
Multi-Head Encoding for Extreme Label Classification
Daojun Liang
Haixia Zhang
Dongfeng Yuan
Minggao Zhang
113
0
0
13 Dec 2024
How to Merge Your Multimodal Models Over Time?
Sebastian Dziadzio
Vishaal Udandarao
Karsten Roth
Ameya Prabhu
Zeynep Akata
Samuel Albanie
Matthias Bethge
MoMe
190
7
0
09 Dec 2024
GAQAT: gradient-adaptive quantization-aware training for domain generalization
Jiacheng Jiang
Yuan Meng
Chen Tang
Han Yu
Qun Li
Zhi Wang
Wenwu Zhu
MQ
80
0
0
07 Dec 2024
Towards Understanding the Role of Sharpness-Aware Minimization Algorithms for Out-of-Distribution Generalization
Samuel Schapiro
Han Zhao
137
1
0
06 Dec 2024
Exponential Moving Average of Weights in Deep Learning: Dynamics and Benefits
Daniel Morales-Brotons
Thijs Vogels
Hadrien Hendrikx
190
23
0
27 Nov 2024
Task Arithmetic Through The Lens Of One-Shot Federated Learning
Zhixu Tao
I. Mason
Sanjeev R. Kulkarni
Xavier Boix
MoMe
FedML
131
7
0
27 Nov 2024
Video-Text Dataset Construction from Multi-AI Feedback: Promoting Weak-to-Strong Preference Learning for Video Large Language Models
Hao Yi
Qingyang Li
Yihan Hu
Fuzheng Zhang
Di Zhang
Yong Liu
VGen
122
0
0
25 Nov 2024
Multi-Token Enhancing for Vision Representation Learning
Zhong-Yu Li
Yu-Song Hu
Bo Yin
Ming-Ming Cheng
174
1
0
24 Nov 2024
Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data
Kai Helli
David Schnurr
Noah Hollmann
Samuel G. Müller
Frank Hutter
OOD
162
10
0
15 Nov 2024
Enhancing generalization in high energy physics using white-box adversarial attacks
Franck Rothen
Samuel Klein
Matthew Leigh
T. Golling
AAML
55
1
0
14 Nov 2024
Model Fusion through Bayesian Optimization in Language Model Fine-Tuning
Chaeyun Jang
Hyungi Lee
Jungtaek Kim
Juho Lee
MoMe
136
4
0
11 Nov 2024