1611.01838
Cited By
Entropy-SGD: Biasing Gradient Descent Into Wide Valleys
6 November 2016
Pratik Chaudhari
A. Choromańska
Stefano Soatto
Yann LeCun
Carlo Baldassi
C. Borgs
J. Chayes
Levent Sagun
R. Zecchina
ODL
Papers citing "Entropy-SGD: Biasing Gradient Descent Into Wide Valleys"
50 / 164 papers shown
Title
MS-Net: Multi-Site Network for Improving Prostate Segmentation with Heterogeneous MRI Data
Quande Liu
Qi Dou
Lequan Yu
Pheng Ann Heng
OOD
71
274
0
09 Feb 2020
'Place-cell' emergence and learning of invariant data with restricted Boltzmann machines: breaking and dynamical restoration of continuous symmetries in the weight space
Moshir Harsh
J. Tubiana
Simona Cocco
R. Monasson
9
14
0
30 Dec 2019
Optimization for deep learning: theory and algorithms
Ruoyu Sun
ODL
19
168
0
19 Dec 2019
Improving Model Robustness Using Causal Knowledge
T. Kyono
M. Schaar
OOD
22
12
0
27 Nov 2019
Information-Theoretic Local Minima Characterization and Regularization
Zhiwei Jia
Hao Su
27
19
0
19 Nov 2019
Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
Colin Wei
Tengyu Ma
AAML
OOD
36
85
0
09 Oct 2019
GradVis: Visualization and Second Order Analysis of Optimization Surfaces during the Training of Deep Neural Networks
Avraam Chatzimichailidis
Franz-Josef Pfreundt
N. Gauger
J. Keuper
19
10
0
26 Sep 2019
EEG-Based Driver Drowsiness Estimation Using Feature Weighted Episodic Training
Yuqi Cui
Yifan Xu
Dongrui Wu
13
62
0
25 Sep 2019
Understanding and Robustifying Differentiable Architecture Search
Arber Zela
T. Elsken
Tonmoy Saikia
Yassine Marrakchi
Thomas Brox
Frank Hutter
OOD
AAML
66
366
0
20 Sep 2019
Learned imaging with constraints and uncertainty quantification
Felix J. Herrmann
Ali Siahkoohi
G. Rizzuti
UQCV
22
23
0
13 Sep 2019
Knowledge Transfer Graph for Deep Collaborative Learning
Soma Minami
Tsubasa Hirakawa
Takayoshi Yamashita
H. Fujiyoshi
28
9
0
10 Sep 2019
Visualizing and Understanding the Effectiveness of BERT
Y. Hao
Li Dong
Furu Wei
Ke Xu
22
181
0
15 Aug 2019
On the Existence of Simpler Machine Learning Models
Lesia Semenova
Cynthia Rudin
Ronald E. Parr
26
85
0
05 Aug 2019
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li
Qilong Gu
Yingxue Zhou
Tiancong Chen
A. Banerjee
ODL
36
51
0
24 Jul 2019
Post-synaptic potential regularization has potential
Enzo Tartaglione
Daniele Perlo
Marco Grangetto
BDL
AAML
27
6
0
19 Jul 2019
Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Nets
Amir-Reza Asadi
Emmanuel Abbe
BDL
AI4CE
34
13
0
26 Jun 2019
Learning to Forget for Meta-Learning
Sungyong Baik
Seokil Hong
Kyoung Mu Lee
CLL
KELM
19
87
0
13 Jun 2019
Limitations of the Empirical Fisher Approximation for Natural Gradient Descent
Frederik Kunstner
Lukas Balles
Philipp Hennig
21
207
0
29 May 2019
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
Mingchen Li
Mahdi Soltanolkotabi
Samet Oymak
NoLa
47
351
0
27 Mar 2019
Multilingual Neural Machine Translation with Knowledge Distillation
Xu Tan
Yi Ren
Di He
Tao Qin
Zhou Zhao
Tie-Yan Liu
20
248
0
27 Feb 2019
An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise
Yeming Wen
Kevin Luk
Maxime Gazeau
Guodong Zhang
Harris Chan
Jimmy Ba
ODL
20
22
0
21 Feb 2019
Investigating Generalisation in Continuous Deep Reinforcement Learning
Chenyang Zhao
Olivier Sigaud
F. Stulp
Timothy M. Hospedales
OffRL
14
48
0
19 Feb 2019
A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks
Umut Simsekli
Levent Sagun
Mert Gurbuzbalaban
20
237
0
18 Jan 2019
An Empirical Study of Example Forgetting during Deep Neural Network Learning
Mariya Toneva
Alessandro Sordoni
Rémi Tachet des Combes
Adam Trischler
Yoshua Bengio
Geoffrey J. Gordon
46
712
0
12 Dec 2018
Gradient Descent Happens in a Tiny Subspace
Guy Gur-Ari
Daniel A. Roberts
Ethan Dyer
28
228
0
12 Dec 2018
Stagewise Training Accelerates Convergence of Testing Error Over SGD
Zhuoning Yuan
Yan Yan
R. L. Jin
Tianbao Yang
52
11
0
10 Dec 2018
Wireless Network Intelligence at the Edge
Jihong Park
S. Samarakoon
M. Bennis
Mérouane Debbah
21
518
0
07 Dec 2018
Simulated Tempering Langevin Monte Carlo II: An Improved Proof using Soft Markov Chain Decomposition
Rong Ge
Holden Lee
Andrej Risteski
14
27
0
29 Nov 2018
Single-Label Multi-Class Image Classification by Deep Logistic Regression
Qi Dong
Xiatian Zhu
S. Gong
8
33
0
20 Nov 2018
Sequenced-Replacement Sampling for Deep Learning
C. Ho
Dae Hoon Park
Wei Yang
Yi Chang
24
0
0
19 Oct 2018
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei
J. Lee
Qiang Liu
Tengyu Ma
20
243
0
12 Oct 2018
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Charles H. Martin
Michael W. Mahoney
AI4CE
35
190
0
02 Oct 2018
Interpreting Adversarial Robustness: A View from Decision Surface in Input Space
Fuxun Yu
Chenchen Liu
Yanzhi Wang
Liang Zhao
Xiang Chen
AAML
OOD
31
27
0
29 Sep 2018
Don't Use Large Mini-Batches, Use Local SGD
Tao R. Lin
Sebastian U. Stich
Kumar Kshitij Patel
Martin Jaggi
57
429
0
22 Aug 2018
Ensemble Kalman Inversion: A Derivative-Free Technique For Machine Learning Tasks
Nikola B. Kovachki
Andrew M. Stuart
BDL
42
136
0
10 Aug 2018
Optimization of neural networks via finite-value quantum fluctuations
Masayuki Ohzeki
Shuntaro Okada
Masayoshi Terabe
S. Taguchi
19
21
0
01 Jul 2018
Understanding Dropout as an Optimization Trick
Sangchul Hahn
Heeyoul Choi
ODL
13
34
0
26 Jun 2018
Persistent Hidden States and Nonlinear Transformation for Long Short-Term Memory
Heeyoul Choi
19
12
0
22 Jun 2018
Laplacian Smoothing Gradient Descent
Stanley Osher
Bao Wang
Penghang Yin
Xiyang Luo
Farzin Barekat
Minh Pham
A. Lin
ODL
22
43
0
17 Jun 2018
The committee machine: Computational to statistical gaps in learning a two-layers neural network
Benjamin Aubin
Antoine Maillard
Jean Barbier
Florent Krzakala
N. Macris
Lenka Zdeborová
41
104
0
14 Jun 2018
Understanding Batch Normalization
Johan Bjorck
Carla P. Gomes
B. Selman
Kilian Q. Weinberger
18
593
0
01 Jun 2018
SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning
W. Wen
Yandan Wang
Feng Yan
Cong Xu
Chunpeng Wu
Yiran Chen
H. Li
24
50
0
21 May 2018
On Visual Hallmarks of Robustness to Adversarial Malware
Alex Huang
Abdullah Al-Dujaili
Erik Hemberg
Una-May O’Reilly
AAML
25
7
0
09 May 2018
Non-Vacuous Generalization Bounds at the ImageNet Scale: A PAC-Bayesian Compression Approach
Wenda Zhou
Victor Veitch
Morgane Austern
Ryan P. Adams
Peter Orbanz
38
209
0
16 Apr 2018
Comparing Dynamics: Deep Neural Networks versus Glassy Systems
Marco Baity-Jesi
Levent Sagun
Mario Geiger
S. Spigler
Gerard Ben Arous
C. Cammarota
Yann LeCun
M. Wyart
Giulio Biroli
AI4CE
33
113
0
19 Mar 2018
On the insufficiency of existing momentum schemes for Stochastic Optimization
Rahul Kidambi
Praneeth Netrapalli
Prateek Jain
Sham Kakade
ODL
22
117
0
15 Mar 2018
Averaging Weights Leads to Wider Optima and Better Generalization
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML
MoMe
39
1,617
0
14 Mar 2018
Understanding and Enhancing the Transferability of Adversarial Examples
Lei Wu
Zhanxing Zhu
Cheng Tai
Weinan E
AAML
SILM
28
96
0
27 Feb 2018
Stronger generalization bounds for deep nets via a compression approach
Sanjeev Arora
Rong Ge
Behnam Neyshabur
Yi Zhang
MLT
AI4CE
23
630
0
14 Feb 2018
Visualizing the Loss Landscape of Neural Nets
Hao Li
Zheng Xu
Gavin Taylor
Christoph Studer
Tom Goldstein
98
1,844
0
28 Dec 2017