2201.00519
Stochastic Weight Averaging Revisited
3 January 2022
Hao Guo
Jiyong Jin
B. Liu
Papers citing
"Stochastic Weight Averaging Revisited"
20 / 20 papers shown
Title
A Model Zoo of Vision Transformers
Damian Falk
Léo Meynent
Florence Pfammatter
Konstantin Schurholt
Damian Borth
34
0
0
14 Apr 2025
Enhancing Accuracy and Parameter-Efficiency of Neural Representations for Network Parameterization
Hongjun Choi
Jayaraman J. Thiagarajan
Ruben Glatt
Shusen Liu
43
0
0
29 Jun 2024
Switch EMA: A Free Lunch for Better Flatness and Sharpness
Siyuan Li
Zicheng Liu
Juanxi Tian
Ge Wang
Zedong Wang
...
Cheng Tan
Tao Lin
Yang Liu
Baigui Sun
Stan Z. Li
30
6
0
14 Feb 2024
Interpretable Time Series Models for Wastewater Modeling in Combined Sewer Overflows
Teodor Chiaburu
Felix Bießmann
AI4TS
AI4CE
25
3
0
04 Jan 2024
Relearning Forgotten Knowledge: on Forgetting, Overfit and Training-Free Ensembles of DNNs
Uri Stern
D. Weinshall
CLL
29
0
0
17 Oct 2023
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML
MoMe
33
52
0
27 Sep 2023
The Split Matters: Flat Minima Methods for Improving the Performance of GNNs
N. Lell
A. Scherp
43
1
0
15 Jun 2023
A Boosted Model Ensembling Approach to Ball Action Spotting in Videos: The Runner-Up Solution to CVPR'23 SoccerNet Challenge
Luping Wang
Hao Guo
B. Liu
35
3
0
09 Jun 2023
Improving Energy Conserving Descent for Machine Learning: Theory and Practice
G. Luca
Alice Gatti
E. Silverstein
20
1
0
01 Jun 2023
A Survey of Historical Learning: Learning Models with Learning History
Xiang Li
Ge Wu
Lingfeng Yang
Wenzhe Wang
Renjie Song
Jian Yang
MU
AI4TS
31
2
0
23 Mar 2023
REPAIR: REnormalizing Permuted Activations for Interpolation Repair
Keller Jordan
Hanie Sedghi
O. Saukh
R. Entezari
Behnam Neyshabur
MoMe
46
94
0
15 Nov 2022
Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging
Jean Kaddour
MoMe
3DH
24
39
0
29 Sep 2022
Learning Gradient-based Mixup towards Flatter Minima for Domain Generalization
Danni Peng
Sinno Jialin Pan
34
2
0
29 Sep 2022
Two-Tailed Averaging: Anytime, Adaptive, Once-in-a-While Optimal Weight Averaging for Better Generalization
Gábor Melis
MoMe
36
1
0
26 Sep 2022
Improving Predictive Performance and Calibration by Weight Fusion in Semantic Segmentation
Timo Sämann
A. Hammam
Andrei Bursuc
Christoph Stiller
H. Groß
FedML
38
1
0
22 Jul 2022
Diverse Weight Averaging for Out-of-Distribution Generalization
Alexandre Ramé
Matthieu Kirchmeyer
Thibaud Rahier
A. Rakotomamonjy
Patrick Gallinari
Matthieu Cord
OOD
199
128
0
19 May 2022
PFGE: Parsimonious Fast Geometric Ensembling of DNNs
Hao Guo
Jiyong Jin
B. Liu
FedML
32
1
0
14 Feb 2022
When Do Flat Minima Optimizers Work?
Jean Kaddour
Linqing Liu
Ricardo M. A. Silva
Matt J. Kusner
ODL
24
58
0
01 Feb 2022
SWAD: Domain Generalization by Seeking Flat Minima
Junbum Cha
Sanghyuk Chun
Kyungjae Lee
Han-Cheol Cho
Seunghyun Park
Yunsung Lee
Sungrae Park
MoMe
216
423
0
17 Feb 2021
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
308
2,890
0
15 Sep 2016