Train faster, generalize better: Stability of stochastic gradient descent

3 September 2015

Moritz Hardt

Benjamin Recht

Y. Singer

Papers citing "Train faster, generalize better: Stability of stochastic gradient descent"

29 / 679 papers shown

Title
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima N. Keskar Dheevatsa Mudigere J. Nocedal M. Smelyanskiy P. T. P. Tang ODL 579 2,948 0 15 Sep 2016
Uniform Generalization, Concentration, and Adaptive Learning Ibrahim Alabdulmohsin FedML 36 2 0 22 Aug 2016
Generalization of ERM in Stochastic Convex Optimization: The Dimension Strikes Back Vitaly Feldman 79 69 0 15 Aug 2016
From Dependence to Causation David Lopez-Paz OOD CML 179 26 0 12 Jul 2016
AdaNet: Adaptive Structural Learning of Artificial Neural Networks Corinna Cortes X. Gonzalvo Vitaly Kuznetsov M. Mohri Scott Yang 103 285 0 05 Jul 2016
On the Expressive Power of Deep Neural Networks M. Raghu Ben Poole Jon M. Kleinberg Surya Ganguli Jascha Narain Sohl-Dickstein 108 791 0 16 Jun 2016
Bolt-on Differential Privacy for Scalable Stochastic Gradient Descent-based Analytics Xi Wu Fengan Li Arun Kumar Kamalika Chaudhuri S. Jha Jeffrey F. Naughton 65 20 0 15 Jun 2016
View-tolerant face recognition and Hebbian learning imply mirror-symmetric neural tuning to head orientation Joel Z Leibo Q. Liao W. Freiwald Fabio Anselmi T. Poggio CVBM 60 56 0 05 Jun 2016
Deep Q-Networks for Accelerating the Training of Deep Neural Networks Jie Fu AI4CE 117 11 0 05 Jun 2016
Fast Zero-Shot Image Tagging Yang Zhang Boqing Gong M. Shah VLM 3DV 74 143 0 31 May 2016
Spectral Methods for Correlated Topic Models Forough Arabshahi Anima Anandkumar OOD 61 2 0 30 May 2016
Optimal Rates for Multi-pass Stochastic Gradient Methods Junhong Lin Lorenzo Rosasco 77 37 0 28 May 2016
Generalization Properties and Implicit Regularization for Multiple Passes SGM Junhong Lin Raffaello Camoriano Lorenzo Rosasco 86 70 0 26 May 2016
No bad local minima: Data independent training error guarantees for multilayer neural networks Daniel Soudry Y. Carmon 214 235 0 26 May 2016
Swapout: Learning an ensemble of deep architectures Saurabh Singh Derek Hoiem David A. Forsyth BDL 3DPC OOD UQCV 71 150 0 20 May 2016
Stabilized Sparse Online Learning for Sparse Data Yuting Ma Tian Zheng 54 14 0 21 Apr 2016
Challenges in Bayesian Adaptive Data Analysis Sam Elder 96 10 0 08 Apr 2016
Deep Online Convex Optimization with Gated Games David Balduzzi 73 8 0 07 Apr 2016
Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity Amit Daniely Roy Frostig Y. Singer 191 345 0 18 Feb 2016
Fast model selection by limiting SVM training times A. Demircioğlu Daniel Horn Tobias Glasmachers B. Bischl C. Weihs 22 0 0 10 Feb 2016
Ensemble Robustness and Generalization of Stochastic Deep Learning Algorithms Tom Zahavy Bingyi Kang Alex Sivak Jiashi Feng Huan Xu Shie Mannor OOD AAML 101 12 0 07 Feb 2016
Training Recurrent Neural Networks by Diffusion H. Mobahi ODL 76 46 0 16 Jan 2016
Average Stability is Invariant to Data Preconditioning. Implications to Exp-concave Empirical Risk Minimization Alon Gonen Shai Shalev-Shwartz 82 25 0 15 Jan 2016
Analysis of Testing-Based Forward Model Selection Damian Kozbur 119 9 0 08 Dec 2015
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks Alec Radford Luke Metz Soumith Chintala GAN OOD 338 14,058 0 19 Nov 2015
On the interplay of network structure and gradient convergence in deep learning V. Ithapu Sathya Ravi Vikas Singh 70 3 0 17 Nov 2015
adaQN: An Adaptive Quasi-Newton Algorithm for Training RNNs N. Keskar A. Berahas ODL 86 35 0 04 Nov 2015
Semantics, Representations and Grammars for Deep Learning David Balduzzi GNN 41 1 0 29 Sep 2015
Deep Online Convex Optimization by Putting Forecaster to Sleep David Balduzzi 53 3 0 06 Sep 2015