Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
Chaoyue Liu, Libin Zhu, M. Belkin
arXiv:2003.00307, v2 (latest), 29 February 2020 [ODL]
Papers citing "Loss landscapes and optimization in over-parameterized non-linear systems and neural networks" (50 / 168 papers shown)
Fast Convergence of Random Reshuffling under Over-Parameterization and the Polyak-Łojasiewicz Condition
Chen Fan, Christos Thrampoulidis, Mark Schmidt (02 Apr 2023)

Unified analysis of SGD-type methods
Eduard A. Gorbunov (29 Mar 2023)

Connected Superlevel Set in (Deep) Reinforcement Learning and its Application to Minimax Theorems
Sihan Zeng, Thinh T. Doan, Justin Romberg (23 Mar 2023) [OffRL]

Rethinking Model Ensemble in Transfer-based Adversarial Attacks
Huanran Chen, Yichi Zhang, Yinpeng Dong, Xiao Yang, Hang Su, Junyi Zhu (16 Mar 2023) [AAML]

Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
Pierre Bréchet, Katerina Papagiannouli, Jing An, Guido Montúfar (06 Mar 2023)

Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim, Coleman Hooper, Thanakul Wattanawong, Minwoo Kang, Ruohan Yan, ..., Qijing Huang, Kurt Keutzer, Michael W. Mahoney, Y. Shao, A. Gholami (27 Feb 2023) [MQ]

Generalization and Stability of Interpolating Neural Networks with Minimal Width
Hossein Taheri, Christos Thrampoulidis (18 Feb 2023)

Data efficiency and extrapolation trends in neural network interatomic potentials
Joshua A Vita, Daniel Schwalbe-Koda (12 Feb 2023)

On the Convergence of Federated Averaging with Cyclic Client Participation
Yae Jee Cho, Pranay Sharma, Gauri Joshi, Zheng Xu, Satyen Kale, Tong Zhang (06 Feb 2023) [FedML]

Rethinking Gauss-Newton for learning over-parameterized models
Michael Arbel, Romain Menegaux, Pierre Wolinski (06 Feb 2023) [AI4CE]

On the Convergence of the Gradient Descent Method with Stochastic Fixed-point Rounding Errors under the Polyak-Lojasiewicz Inequality
Lu Xia, M. Hochstenbach, Stefano Massei (23 Jan 2023)

Convergence beyond the over-parameterized regime using Rayleigh quotients
David A. R. Robin, Kevin Scaman, Marc Lelarge (19 Jan 2023)

On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis
Le-Yu Chen, Jing Xu, J.N. Zhang (02 Jan 2023)

Bayesian Interpolation with Deep Linear Networks
Boris Hanin, Alexander Zlokapa (29 Dec 2022)

Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points
Mayank Baranwal, Param Budhraja, V. Raj, A. Hota (07 Dec 2022)

Zeroth-Order Alternating Gradient Descent Ascent Algorithms for a Class of Nonconvex-Nonconcave Minimax Problems
Zi Xu, Ziqi Wang, Junlin Wang, Y. Dai (24 Nov 2022)

REPAIR: REnormalizing Permuted Activations for Interpolation Repair
Keller Jordan, Hanie Sedghi, O. Saukh, R. Entezari, Behnam Neyshabur (15 Nov 2022) [MoMe]

Spectral Evolution and Invariance in Linear-width Neural Networks
Zhichao Wang, A. Engel, Anand D. Sarwate, Ioana Dumitriu, Tony Chiang (11 Nov 2022)

Neural PDE Solvers for Irregular Domains
Biswajit Khara, Ethan Herron, Zhanhong Jiang, Aditya Balu, Chih-Hsuan Yang, ..., Anushrut Jignasu, Soumik Sarkar, Chinmay Hegde, A. Krishnamurthy, Baskar Ganapathysubramanian (07 Nov 2022) [AI4CE]

Flatter, faster: scaling momentum for optimal speedup of SGD
Aditya Cowsik, T. Can, Paolo Glorioso (28 Oct 2022)

Optimization for Amortized Inverse Problems
Tianci Liu, Tong Yang, Quan Zhang, Qi Lei (25 Oct 2022)

On skip connections and normalisation layers in deep optimisation
L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey (10 Oct 2022) [ODL]

Restricted Strong Convexity of Deep Learning Models with Smooth Activations
A. Banerjee, Pedro Cisneros-Velarde, Libin Zhu, M. Belkin (29 Sep 2022)

Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability
Peisong Wen, Qianqian Xu, Zhiyong Yang, Yuan He, Qingming Huang (27 Sep 2022)

Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold
Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu (19 Sep 2022)

BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach
Mao Ye, B. Liu, S. Wright, Peter Stone, Qian Liu (19 Sep 2022)

Asymptotic Statistical Analysis of f-divergence GAN
Xinwei Shen, Kani Chen, Tong Zhang (14 Sep 2022)

Optimizing the Performative Risk under Weak Convexity Assumptions
Yulai Zhao (02 Sep 2022)

On the generalization of learning algorithms that do not converge
N. Chandramoorthy, Andreas Loukas, Khashayar Gatmiry, Stefanie Jegelka (16 Aug 2022) [MLT]

A Theoretical Analysis of the Learning Dynamics under Class Imbalance
Emanuele Francazi, Marco Baity-Jesi, Aurelien Lucchi (01 Jul 2022)

A note on Linear Bottleneck networks and their Transition to Multilinearity
Libin Zhu, Parthe Pandit, M. Belkin (30 Jun 2022) [MLT]

Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks
G. Farhani, Alexander Kazachek, Boyu Wang (29 Jun 2022)

Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out
Jun-Kun Wang, Chi-Heng Lin, Andre Wibisono, Bin Hu (22 Jun 2022)

PRANC: Pseudo RAndom Networks for Compacting deep models
Parsa Nooralinejad, Ali Abbasi, Soroush Abbasi Koohpayegani, Kossar Pourahmadi Meibodi, Rana Muhammad Shahroz Khan, Soheil Kolouri, Hamed Pirsiavash (16 Jun 2022) [DD]

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion (02 Jun 2022) [ODL]

Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top
Eduard A. Gorbunov, Samuel Horváth, Peter Richtárik, Gauthier Gidel (01 Jun 2022) [AAML]

A Framework for Overparameterized Learning
Dávid Terjék, Diego González-Sánchez (26 May 2022) [MLT]

Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture
Libin Zhu, Chaoyue Liu, M. Belkin (24 May 2022) [GNN, AI4CE]

Beyond Lipschitz: Sharp Generalization and Excess Risk Bounds for Full-Batch GD
Konstantinos E. Nikolakakis, Farzin Haddadpour, Amin Karbasi, Dionysios S. Kalogerias (26 Apr 2022)

On Feature Learning in Neural Networks with Global Convergence Guarantees
Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna (22 Apr 2022) [MLT]

Convergence of gradient descent for deep neural networks
S. Chatterjee (30 Mar 2022) [ODL]

A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima
Tae-Eon Ko, Xiantao Li (21 Mar 2022)

Private Non-Convex Federated Learning Without a Trusted Server
Andrew Lowy, Ali Ghafelebashi, Meisam Razaviyayn (13 Mar 2022) [FedML]

Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models
Chaoyue Liu, Libin Zhu, M. Belkin (10 Mar 2022)

Federated Minimax Optimization: Improved Convergence Analyses and Algorithms
Pranay Sharma, Rohan Panda, Gauri Joshi, P. Varshney (09 Mar 2022) [FedML]

From Optimization Dynamics to Generalization Bounds via Łojasiewicz Gradient Inequality
Fusheng Liu, Haizhao Yang, Soufiane Hayou, Qianxiao Li (22 Feb 2022) [AI4CE]

Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
Bartlomiej Polaczyk, J. Cyranka (28 Jan 2022) [ODL]

Localization in Ensemble Kalman inversion
Xin T. Tong, Matthias Morzfeld (26 Jan 2022)

Approximation bounds for norm constrained neural networks with applications to regression and GANs
Yuling Jiao, Yang Wang, Yunfei Yang (24 Jan 2022)

Generalization in Supervised Learning Through Riemannian Contraction
L. Kozachkov, Patrick M. Wensing, Jean-Jacques E. Slotine (17 Jan 2022) [MLT]