ResearchTrend.AI
Gradient Descent Happens in a Tiny Subspace


12 December 2018
Guy Gur-Ari
Daniel A. Roberts
Ethan Dyer

Papers citing "Gradient Descent Happens in a Tiny Subspace"

13 / 163 papers shown
Quantum algorithm for finding the negative curvature direction in non-convex optimization
Kaining Zhang
Min-hsiu Hsieh
Liu Liu
Dacheng Tao
18
3
0
17 Sep 2019
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li
Qilong Gu
Yingxue Zhou
Tiancong Chen
A. Banerjee
ODL
42
51
0
24 Jul 2019
Subspace Inference for Bayesian Deep Learning
Pavel Izmailov
Wesley J. Maddox
Polina Kirichenko
T. Garipov
Dmitry Vetrov
A. Wilson
UQCV
BDL
38
143
0
17 Jul 2019
SGD momentum optimizer with step estimation by online parabola model
J. Duda
ODL
21
22
0
16 Jul 2019
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
Mingchen Li
Mahdi Soltanolkotabi
Samet Oymak
NoLa
47
351
0
27 Mar 2019
A Simple Baseline for Bayesian Uncertainty in Deep Learning
Wesley J. Maddox
T. Garipov
Pavel Izmailov
Dmitry Vetrov
A. Wilson
BDL
UQCV
45
796
0
07 Feb 2019
Negative eigenvalues of the Hessian in deep neural networks
Guillaume Alain
Nicolas Le Roux
Pierre-Antoine Manzagol
24
42
0
06 Feb 2019
Improving SGD convergence by online linear regression of gradients in multiple statistically relevant directions
J. Duda
ODL
12
1
0
31 Jan 2019
An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
Behrooz Ghorbani
Shankar Krishnan
Ying Xiao
ODL
18
317
0
29 Jan 2019
Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians
Vardan Papyan
30
87
0
24 Jan 2019
An Empirical Model of Large-Batch Training
Sam McCandlish
Jared Kaplan
Dario Amodei
OpenAI Dota Team
24
271
0
14 Dec 2018
A Modern Take on the Bias-Variance Tradeoff in Neural Networks
Brady Neal
Sarthak Mittal
A. Baratin
Vinayak Tantia
Matthew Scicluna
Simon Lacoste-Julien
Ioannis Mitliagkas
37
167
0
19 Oct 2018
There Are Many Consistent Explanations of Unlabeled Data: Why You Should Average
Ben Athiwaratkun
Marc Finzi
Pavel Izmailov
A. Wilson
208
243
0
14 Jun 2018