Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1910.01619
Cited By
Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
3 October 2019
Yu Bai
J. Lee
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks"
29 / 29 papers shown
Title
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
Dayal Singh Kalra
Tianyu He
M. Barkeshli
49
4
0
17 Feb 2025
Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods
Hossein Taheri
Christos Thrampoulidis
Arya Mazumdar
MLT
36
0
0
13 Oct 2024
Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks
Eshaan Nichani
Alexandru Damian
Jason D. Lee
MLT
41
13
0
11 May 2023
Learning Single-Index Models with Shallow Neural Networks
A. Bietti
Joan Bruna
Clayton Sanford
M. Song
164
67
0
27 Oct 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu
Sang Michael Xie
Zhiyuan Li
Tengyu Ma
AI4CE
40
49
0
25 Oct 2022
When Expressivity Meets Trainability: Fewer than
n
n
n
Neurons Can Work
Jiawei Zhang
Yushun Zhang
Mingyi Hong
Ruoyu Sun
Zhi-Quan Luo
26
10
0
21 Oct 2022
Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani
Anirbit Mukherjee
26
5
0
20 Oct 2022
Second-order regression models exhibit progressive sharpening to the edge of stability
Atish Agarwala
Fabian Pedregosa
Jeffrey Pennington
27
26
0
10 Oct 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in
1
d
1d
1
d
R. Gentile
G. Welper
ODL
52
6
0
17 Sep 2022
Towards Understanding Mixture of Experts in Deep Learning
Zixiang Chen
Yihe Deng
Yue-bo Wu
Quanquan Gu
Yuan-Fang Li
MLT
MoE
27
53
0
04 Aug 2022
Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian
Jason D. Lee
Mahdi Soltanolkotabi
SSL
MLT
19
114
0
30 Jun 2022
Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani
Yunzhi Bai
Jason D. Lee
27
10
0
08 Jun 2022
Non-convex online learning via algorithmic equivalence
Udaya Ghai
Zhou Lu
Elad Hazan
14
8
0
30 May 2022
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba
Murat A. Erdogdu
Taiji Suzuki
Zhichao Wang
Denny Wu
Greg Yang
MLT
34
121
0
03 May 2022
On Feature Learning in Neural Networks with Global Convergence Guarantees
Zhengdao Chen
Eric Vanden-Eijnden
Joan Bruna
MLT
36
13
0
22 Apr 2022
Random Feature Amplification: Feature Learning and Generalization in Neural Networks
Spencer Frei
Niladri S. Chatterji
Peter L. Bartlett
MLT
30
29
0
15 Feb 2022
Multi-scale Feature Learning Dynamics: Insights for Double Descent
Mohammad Pezeshki
Amartya Mitra
Yoshua Bengio
Guillaume Lajoie
61
25
0
06 Dec 2021
Provable Regret Bounds for Deep Online Learning and Control
Xinyi Chen
Edgar Minasyan
Jason D. Lee
Elad Hazan
36
6
0
15 Oct 2021
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
Difan Zou
Yuan Cao
Yuanzhi Li
Quanquan Gu
MLT
AI4CE
44
38
0
25 Aug 2021
The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
Geoff Pleiss
John P. Cunningham
28
24
0
11 Jun 2021
The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Mufan Bill Li
Mihai Nica
Daniel M. Roy
30
33
0
07 Jun 2021
Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning
Zixin Wen
Yuanzhi Li
SSL
MLT
29
131
0
31 May 2021
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang
Chi Jin
Zhaoran Wang
Mengdi Wang
Michael I. Jordan
37
18
0
09 Nov 2020
Deep Networks and the Multiple Manifold Problem
Sam Buchanan
D. Gilboa
John N. Wright
166
39
0
25 Aug 2020
A Revision of Neural Tangent Kernel-based Approaches for Neural Networks
Kyungsu Kim
A. Lozano
Eunho Yang
AAML
27
0
0
02 Jul 2020
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders
Yibo Jiang
C. Pehlevan
19
13
0
30 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD
MLT
84
11
0
08 Jun 2020
An Optimization and Generalization Analysis for Max-Pooling Networks
Alon Brutzkus
Amir Globerson
MLT
AI4CE
11
4
0
22 Feb 2020
Norm-Based Capacity Control in Neural Networks
Behnam Neyshabur
Ryota Tomioka
Nathan Srebro
119
577
0
27 Feb 2015
1