Train faster, generalize better: Stability of stochastic gradient descent

3 September 2015

Moritz Hardt

Benjamin Recht

Y. Singer

Papers citing "Train faster, generalize better: Stability of stochastic gradient descent"

50 / 679 papers shown

Title
Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study Assaf Dauber M. Feder Tomer Koren Roi Livni 102 24 0 13 Mar 2020
Revisiting SGD with Increasingly Weighted Averaging: Optimization and Generalization Perspectives Zhishuai Guo Yan Yan Tianbao Yang MoMe 75 4 0 09 Mar 2020
Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations Aditya Golatkar Alessandro Achille Stefano Soatto MU OOD 179 198 0 05 Mar 2020
Disentangling Adaptive Gradient Methods from Learning Rates Naman Agarwal Rohan Anil Elad Hazan Tomer Koren Cyril Zhang 109 38 0 26 Feb 2020
Stagewise Enlargement of Batch Size for SGD-based Learning Shen-Yi Zhao Yin-Peng Xie Wu-Jun Li 47 1 0 26 Feb 2020
Understanding Self-Training for Gradual Domain Adaptation Ananya Kumar Tengyu Ma Percy Liang CLL TTA 100 233 0 26 Feb 2020
Coherent Gradients: An Approach to Understanding Generalization in Gradient Descent-based Optimization S. Chatterjee ODL OOD 122 51 0 25 Feb 2020
Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence Nicolas Loizou Sharan Vaswani I. Laradji Simon Lacoste-Julien 107 189 0 24 Feb 2020
De-randomized PAC-Bayes Margin Bounds: Applications to Non-convex and Non-smooth Predictors A. Banerjee Tiancong Chen Yingxue Zhou BDL 86 8 0 23 Feb 2020
On the generalization of bayesian deep nets for multi-class classification Yossi Adi Yaniv Nemcovsky Alex Schwing Tamir Hazan BDL UQCV 34 1 0 23 Feb 2020
Bounding the expected run-time of nonconvex optimization with early stopping Thomas Flynn K. Yu A. Malik Nicolas D'Imperio Shinjae Yoo 71 2 0 20 Feb 2020
Data Heterogeneity Differential Privacy: From Theory to Algorithm Yilin Kang Jian Li Yong Liu Weiping Wang 76 1 0 20 Feb 2020
Performative Prediction Juan C. Perdomo Tijana Zrnic Celestine Mendler-Dünner Moritz Hardt 203 322 0 16 Feb 2020
Statistical Learning with Conditional Value at Risk Tasuku Soma Yuichi Yoshida 88 38 0 14 Feb 2020
A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima Zeke Xie Issei Sato Masashi Sugiyama ODL 127 17 0 10 Feb 2020
On the distance between two neural networks and the stability of learning Jeremy Bernstein Arash Vahdat Yisong Yue Xuan Li ODL 284 59 0 09 Feb 2020
Characterizing Structural Regularities of Labeled Data in Overparameterized Models Ziheng Jiang Chiyuan Zhang Kunal Talwar Michael C. Mozer TDI 68 104 0 08 Feb 2020
The Statistical Complexity of Early-Stopped Mirror Descent Tomas Vaskevicius Varun Kanade Patrick Rebeschini 90 23 0 01 Feb 2020
Reasoning About Generalization via Conditional Mutual Information Thomas Steinke Lydia Zakynthinou 179 166 0 24 Jan 2020
Understanding Why Neural Networks Generalize Well Through GSNR of Parameters Jinlong Liu Guo-qing Jiang Yunzhi Bai Ting Chen Huayan Wang AI4CE 150 50 0 21 Jan 2020
Generalization Bounds for High-dimensional M-estimation under Sparsity Constraint Xiao-Tong Yuan Ping Li 85 2 0 20 Jan 2020
Big-Data Science in Porous Materials: Materials Genomics and Machine Learning Kevin Maik Jablonka D. Ongari S. M. Moosavi B. Smit AI4CE 93 365 0 18 Jan 2020
Understanding Generalization in Deep Learning via Tensor Methods Jingling Li Yanchao Sun Jiahao Su Taiji Suzuki Furong Huang 131 28 0 14 Jan 2020
Poly-time universality and limitations of deep learning Emmanuel Abbe Colin Sandon 62 23 0 07 Jan 2020
Large-scale Kernel Methods and Applications to Lifelong Robot Learning Raffaello Camoriano 84 1 0 11 Dec 2019
Fantastic Generalization Measures and Where to Find Them Yiding Jiang Behnam Neyshabur H. Mobahi Dilip Krishnan Samy Bengio AI4CE 175 613 0 04 Dec 2019
A Generalization Theory based on Independent and Task-Identically Distributed Assumption Guanhua Zheng Jitao Sang Houqiang Li Jian Yu Changsheng Xu OOD 40 1 0 28 Nov 2019
Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization Shiori Sagawa Pang Wei Koh Tatsunori B. Hashimoto Percy Liang OOD 117 1,254 0 20 Nov 2019
Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks Difan Zou Ziniu Hu Yewen Wang Song Jiang Yizhou Sun Quanquan Gu GNN 123 286 0 17 Nov 2019
Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks Aditya Golatkar Alessandro Achille Stefano Soatto CLL MU 114 508 0 12 Nov 2019
A Comprehensive Comparison of Machine Learning Based Methods Used in Bengali Question Classification Afra Anika Md. Hasibur Rahman Salekul Islam Abu Shafin Mohammad Mahdee Jameel C. R. Rahman 13 2 0 08 Nov 2019
Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates Jeffrey Negrea Mahdi Haghifam Gintare Karolina Dziugaite Ashish Khisti Daniel M. Roy FedML 222 153 0 06 Nov 2019
Diametrical Risk Minimization: Theory and Computations Matthew Norton J. Royset 61 19 0 24 Oct 2019
Sharper bounds for uniformly stable algorithms Olivier Bousquet Yegor Klochkov Nikita Zhivotovskiy 71 122 0 17 Oct 2019
The Implicit Regularization of Ordinary Least Squares Ensembles Daniel LeJeune Hamid Javadi Richard G. Baraniuk 143 43 0 10 Oct 2019
Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin Colin Wei Tengyu Ma AAML OOD 105 85 0 09 Oct 2019
Partial differential equation regularization for supervised machine learning Jillian R. Fisher 63 2 0 03 Oct 2019
Distributed SGD Generalizes Well Under Asynchrony Jayanth Reddy Regatti Gaurav Tendolkar Yi Zhou Abhishek Gupta Yingbin Liang FedML 39 7 0 29 Sep 2019
Randomized Iterative Methods for Linear Systems: Momentum, Inexactness and Gossip Nicolas Loizou 75 5 0 26 Sep 2019
Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network Taiji Suzuki Hiroshi Abe Tomoaki Nishimura AI4CE 81 44 0 25 Sep 2019
On-line Non-Convex Constrained Optimization Olivier Massicot Jakub Mareček 47 13 0 16 Sep 2019
Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective Guan-Horng Liu Evangelos A. Theodorou AI4CE 121 72 0 28 Aug 2019
Private Stochastic Convex Optimization with Optimal Rates Raef Bassily Vitaly Feldman Kunal Talwar Abhradeep Thakurta 94 246 0 27 Aug 2019
Path Length Bounds for Gradient Descent and Flow Chirag Gupta Sivaraman Balakrishnan Aaditya Ramdas 152 15 0 02 Aug 2019
Bias of Homotopic Gradient Descent for the Hinge Loss Denali Molitor Deanna Needell Rachel A. Ward 39 6 0 26 Jul 2019
Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions Matthew Faw Rajat Sen Karthikeyan Shanmugam Constantine Caramanis Sanjay Shakkottai 83 3 0 23 Jul 2019
Generalization Guarantees for Neural Networks via Harnessing the Low-rank Structure of the Jacobian Samet Oymak Zalan Fabian Mingchen Li Mahdi Soltanolkotabi MLT 93 89 0 12 Jun 2019
Does Learning Require Memorization? A Short Tale about a Long Tail Vitaly Feldman TDI 210 504 0 12 Jun 2019
Importance Resampling for Off-policy Prediction M. Schlegel Wesley Chung Daniel Graves Jian Qian Martha White OffRL 57 41 0 11 Jun 2019
Understanding Generalization through Visualizations Wenjie Huang Z. Emam Micah Goldblum Liam H. Fowl J. K. Terry Furong Huang Tom Goldstein AI4CE 72 80 0 07 Jun 2019