The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning

18 December 2017
Siyuan Ma, Raef Bassily, M. Belkin
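For context before the list of citing works: the paper studies mini-batch SGD with a constant step size in the interpolation regime, where an over-parametrized model can fit every training point exactly. Below is a minimal NumPy sketch of that regime; the problem (random least squares), the dimensions, and the step size and batch size are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

# Hypothetical toy setup (not from the paper): over-parametrized least
# squares with d > n and realizable labels, so the interpolation
# condition holds -- some w fits every training point exactly.
rng = np.random.default_rng(0)
n, d = 200, 500
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true

# Mini-batch SGD with a CONSTANT step size; no decay schedule.
w = np.zeros(d)
eta, batch = 5e-3, 16          # illustrative values, chosen for stability
for _ in range(10_000):
    idx = rng.integers(0, n, size=batch)      # sample a mini-batch
    residual = X[idx] @ w - y[idx]            # per-example prediction errors
    w -= eta * (X[idx].T @ residual) / batch  # stochastic gradient step

print("final training MSE:", float(np.mean((X @ w - y) ** 2)))
```

Because every per-example gradient vanishes at an interpolating solution, the stochastic gradient noise shrinks with the error itself, so the fixed step size converges without a decay schedule or a noise floor; this is the effect the paper quantifies.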

Papers citing "The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning"

26 / 76 papers shown

Logarithmic Pruning is All You Need
Laurent Orseau, Marcus Hutter, Omar Rivasplata
22 Jun 2020

SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
Robert Mansel Gower, Othmane Sebbouh, Nicolas Loizou
18 Jun 2020

Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent
Yunwen Lei, Yiming Ying
MLT
15 Jun 2020

An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias
Lu Yu, Krishnakumar Balasubramanian, S. Volgushev, Murat A. Erdogdu
14 Jun 2020

A Unified Theory of Decentralized SGD with Changing Topology and Local Updates
Anastasia Koloskova, Nicolas Loizou, Sadra Boreiri, Martin Jaggi, Sebastian U. Stich
FedML
23 Mar 2020

On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings
Mahmoud Assran, Michael G. Rabbat
27 Feb 2020

Understanding and Mitigating the Tradeoff Between Robustness and Accuracy
Aditi Raghunathan, Sang Michael Xie, Fanny Yang, John C. Duchi, Percy Liang
AAML
25 Feb 2020

Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Nicolas Loizou, Sharan Vaswani, I. Laradji, Simon Lacoste-Julien
24 Feb 2020

Zeroth-Order Algorithms for Nonconvex Minimax Problems with Improved Complexities
Zhongruo Wang, Krishnakumar Balasubramanian, Shiqian Ma, Meisam Razaviyayn
22 Jan 2020

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices
Xiaolong Ma, Wei Niu, Tianyun Zhang, Sijia Liu, Sheng Lin, ..., Xiang Chen, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang
20 Jan 2020

Stochastic Weight Averaging in Parallel: Large-Batch Training that Generalizes Well
Vipul Gupta, S. Serrano, D. DeCoste
MoMe
07 Jan 2020

The Role of Neural Network Activation Functions
Rahul Parhi, Robert D. Nowak
05 Oct 2019

The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication
Sebastian U. Stich, Sai Praneeth Karimireddy
FedML
11 Sep 2019

PCONV: The Missing but Desirable Sparsity in DNN Weight Pruning for Real-time Execution on Mobile Devices
Xiaolong Ma, Fu-Ming Guo, Wei Niu, Xue Lin, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang
CVBM
06 Sep 2019

Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li, Qilong Gu, Yingxue Zhou, Tiancong Chen, A. Banerjee
ODL
24 Jul 2019

Unified Optimal Analysis of the (Stochastic) Gradient Method
Sebastian U. Stich
09 Jul 2019

Does Learning Require Memorization? A Short Tale about a Long Tail
Vitaly Feldman
TDI
12 Jun 2019

Shallow Neural Networks for Fluid Flow Reconstruction with Limited Sensors
N. Benjamin Erichson, L. Mathelin, Z. Yao, Steven L. Brunton, Michael W. Mahoney, J. Nathan Kutz
AI4CE
20 Feb 2019

Reconciling modern machine learning practice and the bias-variance trade-off
M. Belkin, Daniel J. Hsu, Siyuan Ma, Soumik Mandal
28 Dec 2018

Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron
Sharan Vaswani, Francis R. Bach, Mark W. Schmidt
16 Oct 2018

Stochastic (Approximate) Proximal Point Methods: Convergence, Optimality, and Adaptivity
Hilal Asi, John C. Duchi
12 Oct 2018

The Effect of Network Width on the Performance of Large-batch Training
Lingjiao Chen, Hongyi Wang, Jinman Zhao, Dimitris Papailiopoulos, Paraschos Koutris
11 Jun 2018

Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate
Mor Shpigel Nacson, Nathan Srebro, Daniel Soudry
FedML, MLT
05 Jun 2018

Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization
Navid Azizan, B. Hassibi
04 Jun 2018

Local SGD Converges Fast and Communicates Little
Sebastian U. Stich
FedML
24 May 2018

A Proximal Stochastic Gradient Method with Progressive Variance Reduction
Lin Xiao, Tong Zhang
ODL
19 Mar 2014