ResearchTrend.AI

Deep Learning is Singular, and That's Good (arXiv:2010.11560)
22 October 2020
Authors: Daniel Murfet, Susan Wei, Biwei Huang, Hui Li, Jesse Gell-Redman, T. Quella
Community: UQCV

Papers citing "Deep Learning is Singular, and That's Good"

33 papers shown.

  1. Review and Prospect of Algebraic Research in Equivalent Framework between Statistical Mechanics and Machine Learning Theory. Sumio Watanabe. 31 May 2024.
  2. Rethinking Parameter Counting in Deep Models: Effective Dimensionality Revisited. Wesley J. Maddox, Gregory W. Benton, A. Wilson. 04 Mar 2020.
  3. Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks. Agustinus Kristiadi, Matthias Hein, Philipp Hennig. 24 Feb 2020. [BDL, UQCV]
  4. Scaling Laws for Neural Language Models. Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei. 23 Jan 2020.
  5. On the interplay between noise and curvature and its effect on optimization and generalization. Valentin Thomas, Fabian Pedregosa, B. V. Merrienboer, Pierre-Antoine Mangazol, Yoshua Bengio, Nicolas Le Roux. 18 Jun 2019.
  6. Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks. Yuan Cao, Quanquan Gu. 30 May 2019. [MLT, AI4CE]
  7. On the Power and Limitations of Random Features for Understanding Neural Networks. Gilad Yehudai, Ohad Shamir. 01 Apr 2019. [MLT]
  8. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks. Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang. 24 Jan 2019. [MLT]
  9. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers. Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang. 12 Nov 2018. [MLT]
  10. A Convergence Theory for Deep Learning via Over-Parameterization. Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song. 09 Nov 2018. [AI4CE, ODL]
  11. Gradient Descent Provably Optimizes Over-parameterized Neural Networks. S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh. 04 Oct 2018. [MLT, ODL]
  12. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data. Yuanzhi Li, Yingyu Liang. 03 Aug 2018. [MLT]
  13. Stochastic natural gradient descent draws posterior samples in function space. Samuel L. Smith, Daniel Duckworth, Semon Rezchikov, Quoc V. Le, Jascha Narain Sohl-Dickstein. 25 Jun 2018. [BDL]
  14. Neural Tangent Kernel: Convergence and Generalization in Neural Networks. Arthur Jacot, Franck Gabriel, Clément Hongler. 20 Jun 2018.
  15. Energy-entropy competition and the effectiveness of stochastic gradient descent in machine learning. Yao Zhang, Andrew M. Saxe, Madhu S. Advani, A. Lee. 05 Mar 2018.
  16. Stronger generalization bounds for deep nets via a compression approach. Sanjeev Arora, Rong Ge, Behnam Neyshabur, Yi Zhang. 14 Feb 2018. [MLT, AI4CE]
  17. Theory of Deep Learning III: explaining the non-overfitting puzzle. T. Poggio, Kenji Kawaguchi, Q. Liao, Brando Miranda, Lorenzo Rosasco, Xavier Boix, Jack Hidary, H. Mhaskar. 30 Dec 2017. [ODL]
  18. Deep Learning Scaling is Predictable, Empirically. Joel Hestness, Sharan Narang, Newsha Ardalani, G. Diamos, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou. 01 Dec 2017.
  19. Three Factors Influencing Minima in SGD. Stanislaw Jastrzebski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey. 13 Nov 2017.
  20. SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data. Alon Brutzkus, Amir Globerson, Eran Malach, Shai Shalev-Shwartz. 27 Oct 2017. [MLT]
  21. A Bayesian Perspective on Generalization and Stochastic Gradient Descent. Samuel L. Smith, Quoc V. Le. 17 Oct 2017. [BDL]
  22. Searching for Activation Functions. Prajit Ramachandran, Barret Zoph, Quoc V. Le. 16 Oct 2017.
  23. Exploring Generalization in Deep Learning. Behnam Neyshabur, Srinadh Bhojanapalli, David A. McAllester, Nathan Srebro. 27 Jun 2017. [FAtt]
  24. Spectrally-normalized margin bounds for neural networks. Peter L. Bartlett, Dylan J. Foster, Matus Telgarsky. 26 Jun 2017. [ODL]
  25. Fractional Langevin Monte Carlo: Exploring Lévy Driven Stochastic Differential Equations for Markov Chain Monte Carlo. Umut Simsekli. 12 Jun 2017.
  26. Stochastic Gradient Descent as Approximate Bayesian Inference. Stephan Mandt, Matthew D. Hoffman, David M. Blei. 13 Apr 2017. [BDL]
  27. SGD Learns the Conjugate Kernel Class of the Network. Amit Daniely. 27 Feb 2017.
  28. Understanding deep learning requires rethinking generalization. Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals. 10 Nov 2016. [HAI]
  29. Entropy-SGD: Biasing Gradient Descent Into Wide Valleys. Pratik Chaudhari, A. Choromańska, Stefano Soatto, Yann LeCun, Carlo Baldassi, C. Borgs, J. Chayes, Levent Sagun, R. Zecchina. 06 Nov 2016. [ODL]
  30. Gaussian Error Linear Units (GELUs). Dan Hendrycks, Kevin Gimpel. 27 Jun 2016.
  31. Norm-Based Capacity Control in Neural Networks. Behnam Neyshabur, Ryota Tomioka, Nathan Srebro. 27 Feb 2015.
  32. A Widely Applicable Bayesian Information Criterion. Sumio Watanabe. 31 Aug 2012.
  33. The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo. Matthew D. Hoffman, Andrew Gelman. 18 Nov 2011.