arXiv: 1703.04933
Sharp Minima Can Generalize For Deep Nets
15 March 2017
Laurent Dinh
Razvan Pascanu
Samy Bengio
Yoshua Bengio
ODL
Papers citing "Sharp Minima Can Generalize For Deep Nets" (50 of 165 shown)
Sufficient Invariant Learning for Distribution Shift
Taero Kim, Sungjun Lim, Kyungwoo Song · OOD · 24 Oct 2022

Rethinking Sharpness-Aware Minimization as Variational Inference
Szilvia Ujváry, Zsigmond Telek, A. Kerekes, Anna Mészáros, Ferenc Huszár · 19 Oct 2022

Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models
Nikolaos Dimitriadis, P. Frossard, François Fleuret · 18 Oct 2022

SGD with Large Step Sizes Learns Sparse Features
Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion · 11 Oct 2022

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, Dacheng Tao · AAML · 11 Oct 2022

The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima
Peter L. Bartlett, Philip M. Long, Olivier Bousquet · 04 Oct 2022

Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
Sungyub Kim, Si-hun Park, Kyungsu Kim, Eunho Yang · BDL · 30 Sep 2022

Deep Double Descent via Smooth Interpolation
Matteo Gamba, Erik Englesson, Mårten Björkman, Hossein Azizpour · 21 Sep 2022

Learning Symbolic Model-Agnostic Loss Functions via Meta-Learning
Christian Raymond, Qi Chen, Bing Xue, Mengjie Zhang · FedML · 19 Sep 2022

On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi · FedML, AI4CE · 26 Aug 2022

A Deep Learning Approach for the solution of Probability Density Evolution of Stochastic Systems
S. Pourtakdoust, Amir H. Khodabakhsh · 05 Jul 2022

On Leave-One-Out Conditional Mutual Information For Generalization
Mohamad Rida Rammal, Alessandro Achille, Aditya Golatkar, Suhas Diggavi, Stefano Soatto · VLM · 01 Jul 2022

Sparse Double Descent: Where Network Pruning Aggravates Overfitting
Zhengqi He, Zeke Xie, Quanzhi Zhu, Zengchang Qin · 17 Jun 2022

Efficiently Training Low-Curvature Neural Networks
Suraj Srinivas, Kyle Matoba, Himabindu Lakkaraju, F. Fleuret · AAML · 14 Jun 2022

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora · FAtt · 14 Jun 2022

Towards Understanding Sharpness-Aware Minimization
Maksym Andriushchenko, Nicolas Flammarion · AAML · 13 Jun 2022

Trajectory-dependent Generalization Bounds for Deep Neural Networks via Fractional Brownian Motion
Chengli Tan, Jiang Zhang, Junmin Liu · 09 Jun 2022

Linear Connectivity Reveals Generalization Strategies
Jeevesh Juneja, Rachit Bansal, Kyunghyun Cho, João Sedoc, Naomi Saphra · 24 May 2022

Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Chao Ma, D. Kunin, Lei Wu, Lexing Ying · 24 Apr 2022

Small Batch Sizes Improve Training of Low-Resource Neural MT
Àlex R. Atrio, Andrei Popescu-Belis · 20 Mar 2022

QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
Xiuying Wei, Ruihao Gong, Yuhang Li, Xianglong Liu, F. Yu · MQ, VLM · 11 Mar 2022

Adversarial robustness of sparse local Lipschitz predictors
Ramchandran Muthukumar, Jeremias Sulam · AAML · 26 Feb 2022

On PAC-Bayesian reconstruction guarantees for VAEs
Badr-Eddine Chérief-Abdellatif, Yuyang Shi, Arnaud Doucet, Benjamin Guedj · DRL · 23 Feb 2022

Tackling benign nonconvexity with smoothing and stochastic gradients
Harsh Vardhan, Sebastian U. Stich · 18 Feb 2022

A Geometric Understanding of Natural Gradient
Qinxun Bai, S. Rosenberg, Wei Xu · 13 Feb 2022

Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
Yang Zhao, Hao Zhang, Xiuyuan Hu · 08 Feb 2022

Anticorrelated Noise Injection for Improved Generalization
Antonio Orvieto, Hans Kersting, F. Proske, Francis R. Bach, Aurelien Lucchi · 06 Feb 2022

When Do Flat Minima Optimizers Work?
Jean Kaddour, Linqing Liu, Ricardo M. A. Silva, Matt J. Kusner · ODL · 01 Feb 2022

On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie, Qian-Yuan Tang, Yunfeng Cai, Mingming Sun, P. Li · ODL · 31 Jan 2022

Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla, Jing Wang, A. Choromańska · 20 Jan 2022

Neighborhood Region Smoothing Regularization for Finding Flat Minima In Deep Neural Networks
Yang Zhao, Hao Zhang · 16 Jan 2022

Visualizing the Loss Landscape of Winning Lottery Tickets
Robert Bain · UQCV · 16 Dec 2021

On Large Batch Training and Sharp Minima: A Fokker-Planck Perspective
Xiaowu Dai, Yuhua Zhu · 02 Dec 2021

Exponential escape efficiency of SGD from sharp minima in non-stationary regime
Hikaru Ibayashi, Masaaki Imaizumi · 07 Nov 2021

Using Graph-Theoretic Machine Learning to Predict Human Driver Behavior
Rohan Chandra, Aniket Bera, Dinesh Manocha · 04 Nov 2021

Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Xiaoxin He, Fuzhao Xue, Xiaozhe Ren, Yang You · 01 Nov 2021

Hyper-Representations: Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction
Konstantin Schürholt, Dimche Kostadinov, Damian Borth · SSL · 28 Oct 2021

Does the Data Induce Capacity Control in Deep Learning?
Rubing Yang, J. Mao, Pratik Chaudhari · 27 Oct 2021

Towards Better Plasticity-Stability Trade-off in Incremental Learning: A Simple Linear Connector
Guoliang Lin, Hanlu Chu, Hanjiang Lai · MoMe, CLL · 15 Oct 2021

On the Generalization of Models Trained with SGD: Information-Theoretic Bounds and Implications
Ziqiao Wang, Yongyi Mao · FedML, MLT · 07 Oct 2021

Perturbated Gradients Updating within Unit Space for Deep Learning
Ching-Hsun Tseng, Liu Cheng, Shin-Jye Lee, Xiaojun Zeng · 01 Oct 2021

Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization
Alexandre Ramé, Corentin Dancette, Matthieu Cord · OOD · 07 Sep 2021

Shift-Curvature, SGD, and Generalization
Arwen V. Bradley, C. Gomez-Uribe, Manish Reddy Vuyyuru · 21 Aug 2021

Logit Attenuating Weight Normalization
Aman Gupta, R. Ramanath, Jun Shi, Anika Ramachandran, Sirou Zhou, Mingzhou Zhou, S. Keerthi · 12 Aug 2021

Batch Normalization Preconditioning for Neural Network Training
Susanna Lange, Kyle E. Helfrich, Qiang Ye · 02 Aug 2021

Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
Yossi Arjevani, M. Field · 21 Jul 2021

Implicit Gradient Alignment in Distributed and Federated Learning
Yatin Dandi, Luis Barba, Martin Jaggi · FedML · 25 Jun 2021

Minimum sharpness: Scale-invariant parameter-robustness of neural networks
Hikaru Ibayashi, Takuo Hamaguchi, Masaaki Imaizumi · 23 Jun 2021

Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators
David Stutz, Nandhini Chandramoorthy, Matthias Hein, Bernt Schiele · AAML, MQ · 16 Apr 2021

Relating Adversarially Robust Generalization to Flat Minima
David Stutz, Matthias Hein, Bernt Schiele · OOD · 09 Apr 2021