
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
Chaoyue Liu, Libin Zhu, M. Belkin
arXiv:2003.00307 (v2, latest). 29 February 2020. [ODL]

Papers citing "Loss landscapes and optimization in over-parameterized non-linear systems and neural networks"

50 / 168 papers shown
Fast Convergence of Random Reshuffling under Over-Parameterization and the Polyak-Łojasiewicz Condition
  Chen Fan, Christos Thrampoulidis, Mark Schmidt. 02 Apr 2023.
Unified analysis of SGD-type methods
  Eduard A. Gorbunov. 29 Mar 2023.
Connected Superlevel Set in (Deep) Reinforcement Learning and its Application to Minimax Theorems
  Sihan Zeng, Thinh T. Doan, Justin Romberg. 23 Mar 2023. [OffRL]
Rethinking Model Ensemble in Transfer-based Adversarial Attacks
  Huanran Chen, Yichi Zhang, Yinpeng Dong, Xiao Yang, Hang Su, Junyi Zhu. 16 Mar 2023. [AAML]
Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
  Pierre Bréchet, Katerina Papagiannouli, Jing An, Guido Montúfar. 06 Mar 2023.
Full Stack Optimization of Transformer Inference: a Survey
  Sehoon Kim, Coleman Hooper, Thanakul Wattanawong, Minwoo Kang, Ruohan Yan, ..., Qijing Huang, Kurt Keutzer, Michael W. Mahoney, Y. Shao, A. Gholami. 27 Feb 2023. [MQ]
Generalization and Stability of Interpolating Neural Networks with Minimal Width
  Hossein Taheri, Christos Thrampoulidis. 18 Feb 2023.
Data efficiency and extrapolation trends in neural network interatomic potentials
  Joshua A Vita, Daniel Schwalbe-Koda. 12 Feb 2023.
On the Convergence of Federated Averaging with Cyclic Client Participation
  Yae Jee Cho, Pranay Sharma, Gauri Joshi, Zheng Xu, Satyen Kale, Tong Zhang. 06 Feb 2023. [FedML]
Rethinking Gauss-Newton for learning over-parameterized models
  Michael Arbel, Romain Menegaux, Pierre Wolinski. 06 Feb 2023. [AI4CE]
On the Convergence of the Gradient Descent Method with Stochastic Fixed-point Rounding Errors under the Polyak-Lojasiewicz Inequality
  Lu Xia, M. Hochstenbach, Stefano Massei. 23 Jan 2023.
Convergence beyond the over-parameterized regime using Rayleigh quotients
  David A. R. Robin, Kevin Scaman, Marc Lelarge. 19 Jan 2023.
On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis
  Le‐Yu Chen, Jing Xu, J.N. Zhang. 02 Jan 2023.
Bayesian Interpolation with Deep Linear Networks
  Boris Hanin, Alexander Zlokapa. 29 Dec 2022.
Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points
  Mayank Baranwal, Param Budhraja, V. Raj, A. Hota. 07 Dec 2022.
Zeroth-Order Alternating Gradient Descent Ascent Algorithms for a Class of Nonconvex-Nonconcave Minimax Problems
  Zi Xu, Ziqi Wang, Junlin Wang, Y. Dai. 24 Nov 2022.
REPAIR: REnormalizing Permuted Activations for Interpolation Repair
  Keller Jordan, Hanie Sedghi, O. Saukh, R. Entezari, Behnam Neyshabur. 15 Nov 2022. [MoMe]
Spectral Evolution and Invariance in Linear-width Neural Networks
  Zhichao Wang, A. Engel, Anand D. Sarwate, Ioana Dumitriu, Tony Chiang. 11 Nov 2022.
Neural PDE Solvers for Irregular Domains
  Biswajit Khara, Ethan Herron, Zhanhong Jiang, Aditya Balu, Chih-Hsuan Yang, ..., Anushrut Jignasu, Soumik Sarkar, Chinmay Hegde, A. Krishnamurthy, Baskar Ganapathysubramanian. 07 Nov 2022. [AI4CE]
Flatter, faster: scaling momentum for optimal speedup of SGD
  Aditya Cowsik, T. Can, Paolo Glorioso. 28 Oct 2022.
Optimization for Amortized Inverse Problems
  Tianci Liu, Tong Yang, Quan Zhang, Qi Lei. 25 Oct 2022.
On skip connections and normalisation layers in deep optimisation
  L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey. 10 Oct 2022. [ODL]
Restricted Strong Convexity of Deep Learning Models with Smooth Activations
  A. Banerjee, Pedro Cisneros-Velarde, Libin Zhu, M. Belkin. 29 Sep 2022.
Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability
  Peisong Wen, Qianqian Xu, Zhiyong Yang, Yuan He, Qingming Huang. 27 Sep 2022.
Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold
  Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu. 19 Sep 2022.
BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach
  Mao Ye, B. Liu, S. Wright, Peter Stone, Qian Liu. 19 Sep 2022.
Asymptotic Statistical Analysis of $f$-divergence GAN
  Xinwei Shen, Kani Chen, Tong Zhang. 14 Sep 2022.
Optimizing the Performative Risk under Weak Convexity Assumptions
  Yulai Zhao. 02 Sep 2022.
On the generalization of learning algorithms that do not converge
  N. Chandramoorthy, Andreas Loukas, Khashayar Gatmiry, Stefanie Jegelka. 16 Aug 2022. [MLT]
A Theoretical Analysis of the Learning Dynamics under Class Imbalance
  Emanuele Francazi, Marco Baity-Jesi, Aurelien Lucchi. 01 Jul 2022.
A note on Linear Bottleneck networks and their Transition to Multilinearity
  Libin Zhu, Parthe Pandit, M. Belkin. 30 Jun 2022. [MLT]
Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks
  G. Farhani, Alexander Kazachek, Boyu Wang. 29 Jun 2022.
Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out
  Jun-Kun Wang, Chi-Heng Lin, Andre Wibisono, Bin Hu. 22 Jun 2022.
PRANC: Pseudo RAndom Networks for Compacting deep models
  Parsa Nooralinejad, Ali Abbasi, Soroush Abbasi Koohpayegani, Kossar Pourahmadi Meibodi, Rana Muhammad Shahroz Khan, Soheil Kolouri, Hamed Pirsiavash. 16 Jun 2022. [DD]
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
  Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion. 02 Jun 2022. [ODL]
Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top
  Eduard A. Gorbunov, Samuel Horváth, Peter Richtárik, Gauthier Gidel. 01 Jun 2022. [AAML]
A Framework for Overparameterized Learning
  Dávid Terjék, Diego González-Sánchez. 26 May 2022. [MLT]
Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture
  Libin Zhu, Chaoyue Liu, M. Belkin. 24 May 2022. [GNN, AI4CE]
Beyond Lipschitz: Sharp Generalization and Excess Risk Bounds for Full-Batch GD
  Konstantinos E. Nikolakakis, Farzin Haddadpour, Amin Karbasi, Dionysios S. Kalogerias. 26 Apr 2022.
On Feature Learning in Neural Networks with Global Convergence Guarantees
  Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna. 22 Apr 2022. [MLT]
Convergence of gradient descent for deep neural networks
  S. Chatterjee. 30 Mar 2022. [ODL]
A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima
  Tae-Eon Ko, Xiantao Li. 21 Mar 2022.
Private Non-Convex Federated Learning Without a Trusted Server
  Andrew Lowy, Ali Ghafelebashi, Meisam Razaviyayn. 13 Mar 2022. [FedML]
Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models
  Chaoyue Liu, Libin Zhu, M. Belkin. 10 Mar 2022.
Federated Minimax Optimization: Improved Convergence Analyses and Algorithms
  Pranay Sharma, Rohan Panda, Gauri Joshi, P. Varshney. 09 Mar 2022. [FedML]
From Optimization Dynamics to Generalization Bounds via Łojasiewicz Gradient Inequality
  Fusheng Liu, Haizhao Yang, Soufiane Hayou, Qianxiao Li. 22 Feb 2022. [AI4CE]
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
  Bartlomiej Polaczyk, J. Cyranka. 28 Jan 2022. [ODL]
Localization in Ensemble Kalman inversion
  Xin T. Tong, Matthias Morzfeld. 26 Jan 2022.
Approximation bounds for norm constrained neural networks with applications to regression and GANs
  Yuling Jiao, Yang Wang, Yunfei Yang. 24 Jan 2022.
Generalization in Supervised Learning Through Riemannian Contraction
  L. Kozachkov, Patrick M. Wensing, Jean-Jacques E. Slotine. 17 Jan 2022. [MLT]