Averaging Weights Leads to Wider Optima and Better Generalization

14 March 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
    FedML MoMe

Papers citing "Averaging Weights Leads to Wider Optima and Better Generalization"

50 / 1,040 papers shown
Deep Learning Meets Teleconnections: Improving S2S Predictions for European Winter Weather
P. Bommer
M. Kretschmer
Fiona R. Spuler
Kirill Bykov
Marina M.-C. Höhne
AI4Cl
51
1
0
10 Apr 2025
Multi-fidelity Reinforcement Learning Control for Complex Dynamical Systems
Luning Sun
Xin-Yang Liu
Siyan Zhao
Aditya Grover
Jian-Xun Wang
Jayaraman J. Thiagarajan
AI4CE
108
0
0
08 Apr 2025
Defending Deep Neural Networks against Backdoor Attacks via Module Switching
Weijun Li
Ansh Arora
Xuanli He
Mark Dras
Xingliang Yuan
AAML MoMe
100
0
0
08 Apr 2025
Generative Classifier for Domain Generalization
Shaocong Long
Qianyu Zhou
Xuelong Li
Chenhao Ying
Yunhai Tong
Lizhuang Ma
Yuan Luo
Dacheng Tao
85
0
0
03 Apr 2025
Bayesian Pseudo Posterior Mechanism for Differentially Private Machine Learning
Robert Chew
Matthew R. Williams
Elan A. Segarra
Alexander J. Preiss
Amanda Konet
T. Savitsky
86
0
0
27 Mar 2025
Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos
Jiaheng Zhou
Yanfeng Zhou
Wei Fang
Yuxing Tang
Le Lu
Ge Yang
Mamba
515
0
0
26 Mar 2025
Balanced Direction from Multifarious Choices: Arithmetic Meta-Learning for Domain Generalization
Xiran Wang
Jian Zhang
Lei Qi
Yinghuan Shi
99
1
0
23 Mar 2025
On Local Posterior Structure in Deep Ensembles
Mikkel Jordahn
Jonas Vestergaard Jensen
Mikkel N. Schmidt
Michael Riis Andersen
UQCV BDL OOD
151
0
0
17 Mar 2025
FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization
Hao Mark Chen
S. Hu
Wayne Luk
Timothy M. Hospedales
Hongxiang Fan
MoMe
119
1
0
16 Mar 2025
Robust Dataset Distillation by Matching Adversarial Trajectories
Wei Lai
Tianyu Ding
Ren Dongdong
Lei Wang
Jing Huo
Yang Gao
Wenbin Li
AAML DD
102
0
0
15 Mar 2025
Entropy-regularized Gradient Estimators for Approximate Bayesian Inference
Jasmeet Kaur
BDL UQCV
124
0
0
15 Mar 2025
Understanding Flatness in Generative Models: Its Role and Benefits
Taehwan Lee
Kyeongkook Seo
Jaejun Yoo
Sung Whan Yoon
DiffM
97
0
0
14 Mar 2025
Enhanced Soups for Graph Neural Networks
Joseph Zuber
Aishwarya Sarkar
Joseph Jennings
Ali Jannesari
118
1
0
14 Mar 2025
Understanding the Logical Capabilities of Large Language Models via Out-of-Context Representation Learning
Jonathan Shaki
Emanuele La Malfa
Michael Wooldridge
Sarit Kraus
LRM ReLM
141
0
0
13 Mar 2025
SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting
Linqi Yang
Xiongwei Zhao
Qihao Sun
Ke Wang
Ao Chen
Peng Kang
3DGS
138
6
0
07 Mar 2025
CrystalFramer: Rethinking the Role of Frames for SE(3)-Invariant Crystal Structure Modeling
Yusei Ito
Tatsunori Taniai
Ryo Igarashi
Yoshitaka Ushiku
K. Ono
106
0
0
04 Mar 2025
Deep Learning is Not So Mysterious or Different
Andrew Gordon Wilson
101
6
0
03 Mar 2025
Gradient-Guided Annealing for Domain Generalization
Aristotelis Ballas
Christos Diou
OOD
616
1
0
27 Feb 2025
HALO: Robust Out-of-Distribution Detection via Joint Optimisation
Hugo Lyons Keenan
S. Erfani
Christopher Leckie
OODD
273
0
0
27 Feb 2025
CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging
Zongzhen Yang
Binhang Qi
Hailong Sun
Wenrui Long
Ruobing Zhao
Xiang Gao
MoMe
120
0
0
26 Feb 2025
SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation
Dahun Shin
Dongyeop Lee
Jinseok Chung
Namhoon Lee
ODL AAML
514
0
0
25 Feb 2025
Low-rank bias, weight decay, and model merging in neural networks
Ilja Kuzborskij
Yasin Abbasi-Yadkori
81
0
0
24 Feb 2025
Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation
Qiuming Zhao
Guangzhi Sun
Chao Zhang
Mingxing Xu
Thomas Fang Zheng
MoMe VLM
457
1
0
24 Feb 2025
Robust Concept Erasure Using Task Vectors
Minh Pham
Kelly O. Marshall
Chinmay Hegde
Niv Cohen
190
20
0
21 Feb 2025
Scalable Model Merging with Progressive Layer-wise Distillation
Jing Xu
Jiazheng Li
J.N. Zhang
MoMe FedML
327
2
0
18 Feb 2025
When, Where and Why to Average Weights?
Niccolò Ajroldi
Antonio Orvieto
Jonas Geiping
MoMe
287
0
0
10 Feb 2025
Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble
Atsushi Nitanda
Anzelle Lee
Damian Tan Xing Kai
Mizuki Sakaguchi
Taiji Suzuki
AI4CE
110
1
0
09 Feb 2025
LEAD: Large Foundation Model for EEG-Based Alzheimer's Disease Detection
Yihe Wang
Nan Huang
Nadia Mammone
Marco Cecchi
Xiang Zhang
148
1
0
02 Feb 2025
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Wenlin Yao
212
1
0
01 Feb 2025
Task Arithmetic in Trust Region: A Training-Free Model Merging Approach to Navigate Knowledge Conflicts
Wenju Sun
Qingyong Li
Wen Wang
Yangli-ao Geng
Boyang Li
192
5
0
28 Jan 2025
Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers
Xinyu Tang
Xiaolei Wang
Wayne Xin Zhao
Siyuan Lu
Yaliang Li
Ji-Rong Wen
LRM
131
18
0
28 Jan 2025
Hierarchical Light Transformer Ensembles for Multimodal Trajectory Forecasting
Adrien Lafage
Mathieu Barbier
Gianni Franchi
David Filliat
85
3
0
08 Jan 2025
Training-free Heterogeneous Model Merging
Zhengqi Xu
Han Zheng
Jie Song
Li Sun
Mingli Song
MoMe
236
1
0
03 Jan 2025
Test-Time Adaptation in Point Clouds: Leveraging Sampling Variation with Weight Averaging
Ali Bahri
Moslem Yazdanpanah
Mehrdad Noori
Sahar Dastani Oghani
Milad Cheraghalikhani
David Osowiech
Farzad Beizaee
G. A. V. Hakim
Ismail ben Ayed
Christian Desrosiers
3DPC TTA
141
1
0
31 Dec 2024
Two Heads Are Better Than One: Averaging along Fine-Tuning to Improve Targeted Transferability
Hui Zeng
Sanshuai Cui
Biwei Chen
Anjie Peng
AAML
121
0
0
31 Dec 2024
Parameter-Efficient Interventions for Enhanced Model Merging
Marcin Osial
Daniel Marczak
Bartosz Zieliński
MoMe
150
1
0
22 Dec 2024
Non-Uniform Parameter-Wise Model Merging
Albert Manuel Orozco Camacho
Stefan Horoi
Guy Wolf
Eugene Belilovsky
MoMe FedML
140
0
0
20 Dec 2024
Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes
Aodi Li
Liansheng Zhuang
Xiao Long
Minghong Yao
Shafei Wang
515
1
0
18 Dec 2024
MAL: Cluster-Masked and Multi-Task Pretraining for Enhanced xLSTM Vision Performance
Wenjun Huang
Jianguo Hu
123
0
0
14 Dec 2024
Multi-Head Encoding for Extreme Label Classification
Daojun Liang
Haixia Zhang
Dongfeng Yuan
Minggao Zhang
113
0
0
13 Dec 2024
How to Merge Your Multimodal Models Over Time?
Sebastian Dziadzio
Vishaal Udandarao
Karsten Roth
Ameya Prabhu
Zeynep Akata
Samuel Albanie
Matthias Bethge
MoMe
190
7
0
09 Dec 2024
GAQAT: gradient-adaptive quantization-aware training for domain generalization
Jiacheng Jiang
Yuan Meng
Chen Tang
Han Yu
Qun Li
Zhi Wang
Wenwu Zhu
MQ
80
0
0
07 Dec 2024
Towards Understanding the Role of Sharpness-Aware Minimization Algorithms for Out-of-Distribution Generalization
Samuel Schapiro
Han Zhao
137
1
0
06 Dec 2024
Exponential Moving Average of Weights in Deep Learning: Dynamics and Benefits
Daniel Morales-Brotons
Thijs Vogels
Hadrien Hendrikx
190
23
0
27 Nov 2024
Task Arithmetic Through The Lens Of One-Shot Federated Learning
Zhixu Tao
I. Mason
Sanjeev R. Kulkarni
Xavier Boix
MoMe FedML
131
7
0
27 Nov 2024
Video-Text Dataset Construction from Multi-AI Feedback: Promoting Weak-to-Strong Preference Learning for Video Large Language Models
Hao Yi
Qingyang Li
Yihan Hu
Fuzheng Zhang
Di Zhang
Yong Liu
VGen
122
0
0
25 Nov 2024
Multi-Token Enhancing for Vision Representation Learning
Zhong-Yu Li
Yu-Song Hu
Bo Yin
Ming-Ming Cheng
174
1
0
24 Nov 2024
Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data
Kai Helli
David Schnurr
Noah Hollmann
Samuel G. Müller
Frank Hutter
OOD
162
10
0
15 Nov 2024
Enhancing generalization in high energy physics using white-box adversarial attacks
Franck Rothen
Samuel Klein
Matthew Leigh
T. Golling
AAML
55
1
0
14 Nov 2024
Model Fusion through Bayesian Optimization in Language Model Fine-Tuning
Chaeyun Jang
Hyungi Lee
Jungtaek Kim
Juho Lee
MoMe
136
4
0
11 Nov 2024