Train faster, generalize better: Stability of stochastic gradient descent
Moritz Hardt, Benjamin Recht, Y. Singer
arXiv: 1509.01240 (v2, latest) | 3 September 2015

Papers citing "Train faster, generalize better: Stability of stochastic gradient descent"
50 / 679 papers shown

Convergence and Stability of the Stochastic Proximal Point Algorithm with Momentum
Junhyung Lyle Kim, Panos Toulis, Anastasios Kyrillidis | 11 Nov 2021

Learning Rates for Nonconvex Pairwise Learning
Shaojie Li, Yong Liu | 09 Nov 2021

Cooperative Deep Q-learning Framework for Environments Providing Image Feedback
Krishnan Raghavan, Vignesh Narayanan, S. Jagannathan | VLM, OffRL | 28 Oct 2021

Learning to Control using Image Feedback
Krishnan Raghavan, Vignesh Narayanan, Jagannathan Saraangapani | 28 Oct 2021

Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD
Bohan Wang, Huishuai Zhang, Jieyu Zhang, Qi Meng, Wei Chen, Tie-Yan Liu | 26 Oct 2021

Fast and Accurate Graph Learning for Huge Data via Minipatch Ensembles
Tianyi Yao, Minjie Wang, Genevera I. Allen | 22 Oct 2021

Differentially Private Coordinate Descent for Composite Empirical Risk Minimization
Paul Mangold, A. Bellet, Joseph Salmon, Marc Tommasi | 22 Oct 2021

Deep Active Learning by Leveraging Training Dynamics
Haonan Wang, Wei Huang, Ziwei Wu, A. Margenot, Hanghang Tong, Jingrui He | AI4CE | 16 Oct 2021

Towards Statistical and Computational Complexities of Polyak Step Size Gradient Descent
Zhaolin Ren, Fuheng Cui, Alexia Atsidakou, Sujay Sanghavi, Nhat Ho | 15 Oct 2021

Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach
Qitian Wu, Chenxiao Yang, Junchi Yan | 09 Oct 2021

On the Generalization of Models Trained with SGD: Information-Theoretic Bounds and Implications
Ziqiao Wang, Yongyi Mao | FedML, MLT | 07 Oct 2021

Spectral Bias in Practice: The Role of Function Frequency in Generalization
Sara Fridovich-Keil, Raphael Gontijo-Lopes, Rebecca Roelofs | 06 Oct 2021

Regularization Guarantees Generalization in Bayesian Reinforcement Learning through Algorithmic Stability
Aviv Tamar, Daniel Soudry, E. Zisselman | OOD, OffRL | 24 Sep 2021

Adversarial Representation Learning With Closed-Form Solvers
Bashir Sadeghi, Lan Wang, Vishnu Boddeti | 12 Sep 2021

NASI: Label- and Data-agnostic Neural Architecture Search at Initialization
Yao Shu, Shaofeng Cai, Zhongxiang Dai, Beng Chin Ooi, K. H. Low | 02 Sep 2021

The Impact of Reinitialization on Generalization in Convolutional Neural Networks
Ibrahim Alabdulmohsin, Hartmut Maennel, Daniel Keysers | AI4CE | 01 Sep 2021

Neural TMDlayer: Modeling Instantaneous flow of features via SDE Generators
Zihang Meng, Vikas Singh, Sathya Ravi | 19 Aug 2021

Stability and Generalization for Randomized Coordinate Descent
Puyu Wang, Liang Wu, Yunwen Lei | 17 Aug 2021

Towards Understanding Theoretical Advantages of Complex-Reaction Networks
Shao-Qun Zhang, Gaoxin Wei, Zhi Zhou | 15 Aug 2021

Implicit Sparse Regularization: The Impact of Depth and Early Stopping
Jiangyuan Li, Thanh V. Nguyen, Chinmay Hegde, R. K. Wong | 12 Aug 2021

Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers
Liam Hodgkinson, Umut Simsekli, Rajiv Khanna, Michael W. Mahoney | 02 Aug 2021

Faster Rates of Private Stochastic Convex Optimization
Jinyan Su, Lijie Hu, Di Wang | 31 Jul 2021

Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel
Dominic Richards, Ilja Kuzborskij | 27 Jul 2021

Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization
Chiyuan Zhang, M. Raghu, Jon M. Kleinberg, Samy Bengio | OOD | 27 Jul 2021

Improved Learning Rates for Stochastic Optimization: Two Theoretical Viewpoints
Shaojie Li, Yong Liu | 19 Jul 2021

Transfer Learning in Multi-Agent Reinforcement Learning with Double Q-Networks for Distributed Resource Sharing in V2X Communication
Hammad Zafar, Zoran Utkovski, Martin Kasparick, S. Stańczak | OffRL | 13 Jul 2021

Differentially Private Stochastic Optimization: New Results in Convex and Non-Convex Settings
Raef Bassily, Cristóbal Guzmán, Michael Menart | 12 Jul 2021

AdaL: Adaptive Gradient Transformation Contributes to Convergences and Generalizations
Hongwei Zhang, Weidong Zou, Hongbo Zhao, Qi Ming, Tijin Yan, Yuanqing Xia, Weipeng Cao | ODL | 04 Jul 2021

Never Go Full Batch (in Stochastic Convex Optimization)
I Zaghloul Amir, Y. Carmon, Tomer Koren, Roi Livni | 29 Jun 2021

Optimal Rates for Random Order Online Optimization
Uri Sherman, Tomer Koren, Yishay Mansour | 29 Jun 2021

Deep Learning for Functional Data Analysis with Adaptive Basis Layers
Ju Yao, Jonas W. Mueller, Jane-ling Wang | 19 Jun 2021

Shuffle Private Stochastic Convex Optimization
Albert Cheu, Matthew Joseph, Jieming Mao, Binghui Peng | FedML | 17 Jun 2021

Towards Understanding Generalization via Decomposing Excess Risk Dynamics
Jiaye Teng, Jianhao Ma, Yang Yuan | 11 Jun 2021

Learning subtree pattern importance for Weisfeiler-Lehman based graph kernels
Dai Hai Nguyen, Canh Hao Nguyen, Hiroshi Mamitsuka | 08 Jun 2021

Stability and Generalization of Bilevel Programming in Hyperparameter Optimization
Fan Bao, Guoqiang Wu, Chongxuan Li, Jun Zhu, Bo Zhang | 08 Jun 2021

What training reveals about neural network complexity
Andreas Loukas, Marinos Poiitis, Stefanie Jegelka | 08 Jun 2021

The Randomness of Input Data Spaces is an A Priori Predictor for Generalization
Martin Briesch, Dominik Sobania, Franz Rothlauf | UQCV | 08 Jun 2021

Minibatch and Momentum Model-based Methods for Stochastic Weakly Convex Optimization
Qi Deng, Wenzhi Gao | 06 Jun 2021

Dynamic Scheduling for Over-the-Air Federated Edge Learning with Energy Constraints
Yuxuan Sun, Sheng Zhou, Z. Niu, Deniz Gündüz | 31 May 2021

On the geometry of generalization and memorization in deep neural networks
Cory Stephenson, Suchismita Padhy, Abhinav Ganesh, Yue Hui, Hanlin Tang, SueYeon Chung | TDI, AI4CE | 30 May 2021

Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems
Prateek Jain, S. Kowshik, Dheeraj M. Nagaraj, Praneeth Netrapalli | OffRL | 24 May 2021

Why Does Multi-Epoch Training Help?
Yi Tian Xu, Qi Qian, Hao Li, Rong Jin | 13 May 2021

Stability and Generalization of Stochastic Gradient Methods for Minimax Problems
Yunwen Lei, Zhenhuan Yang, Tianbao Yang, Yiming Ying | 08 May 2021

RATT: Leveraging Unlabeled Data to Guarantee Generalization
Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary Chase Lipton | 01 May 2021

Random Reshuffling with Variance Reduction: New Analysis and Better Rates
Grigory Malinovsky, Alibek Sailanbayev, Peter Richtárik | 19 Apr 2021

PAC Bayesian Performance Guarantees for Deep (Stochastic) Networks in Medical Imaging
Anthony Sicilia, Xingchen Zhao, Anastasia Sosnovskikh, Seong Jae Hwang | BDL, UQCV | 12 Apr 2021

Optimal Algorithms for Differentially Private Stochastic Monotone Variational Inequalities and Saddle-Point Problems
Digvijay Boob, Cristóbal Guzmán | 07 Apr 2021

Neurons learn slower than they think
I. Kulikovskikh | 02 Apr 2021

Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
Zeke Xie, Li-xin Yuan, Zhanxing Zhu, Masashi Sugiyama | 31 Mar 2021

Research of Damped Newton Stochastic Gradient Descent Method for Neural Network Training
Jingcheng Zhou, Wei Wei, Zhiming Zheng | ODL | 31 Mar 2021