Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
Chaoyue Liu, Libin Zhu, M. Belkin
arXiv:2003.00307, v2 (latest), 29 February 2020 [ODL]
Papers citing "Loss landscapes and optimization in over-parameterized non-linear systems and neural networks" (50 / 168 papers shown)
Fast Convergence of Random Reshuffling under Over-Parameterization and the Polyak-Łojasiewicz Condition
Chen Fan, Christos Thrampoulidis, Mark Schmidt (02 Apr 2023)

Unified analysis of SGD-type methods
Eduard A. Gorbunov (29 Mar 2023)

Connected Superlevel Set in (Deep) Reinforcement Learning and its Application to Minimax Theorems
Sihan Zeng, Thinh T. Doan, Justin Romberg (23 Mar 2023) [OffRL]

Rethinking Model Ensemble in Transfer-based Adversarial Attacks
Huanran Chen, Yichi Zhang, Yinpeng Dong, Xiao Yang, Hang Su, Junyi Zhu (16 Mar 2023) [AAML]

Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
Pierre Bréchet, Katerina Papagiannouli, Jing An, Guido Montúfar (06 Mar 2023)

Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim, Coleman Hooper, Thanakul Wattanawong, Minwoo Kang, Ruohan Yan, ..., Qijing Huang, Kurt Keutzer, Michael W. Mahoney, Y. Shao, A. Gholami (27 Feb 2023) [MQ]

Generalization and Stability of Interpolating Neural Networks with Minimal Width
Hossein Taheri, Christos Thrampoulidis (18 Feb 2023)

Data efficiency and extrapolation trends in neural network interatomic potentials
Joshua A Vita, Daniel Schwalbe-Koda (12 Feb 2023)

On the Convergence of Federated Averaging with Cyclic Client Participation
Yae Jee Cho, Pranay Sharma, Gauri Joshi, Zheng Xu, Satyen Kale, Tong Zhang (06 Feb 2023) [FedML]

Rethinking Gauss-Newton for learning over-parameterized models
Michael Arbel, Romain Menegaux, Pierre Wolinski (06 Feb 2023) [AI4CE]

On the Convergence of the Gradient Descent Method with Stochastic Fixed-point Rounding Errors under the Polyak-Lojasiewicz Inequality
Lu Xia, M. Hochstenbach, Stefano Massei (23 Jan 2023)

Convergence beyond the over-parameterized regime using Rayleigh quotients
David A. R. Robin, Kevin Scaman, Marc Lelarge (19 Jan 2023)

On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis
Le-Yu Chen, Jing Xu, J.N. Zhang (02 Jan 2023)

Bayesian Interpolation with Deep Linear Networks
Boris Hanin, Alexander Zlokapa (29 Dec 2022)

Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points
Mayank Baranwal, Param Budhraja, V. Raj, A. Hota (07 Dec 2022)

Zeroth-Order Alternating Gradient Descent Ascent Algorithms for a Class of Nonconvex-Nonconcave Minimax Problems
Zi Xu, Ziqi Wang, Junlin Wang, Y. Dai (24 Nov 2022)

REPAIR: REnormalizing Permuted Activations for Interpolation Repair
Keller Jordan, Hanie Sedghi, O. Saukh, R. Entezari, Behnam Neyshabur (15 Nov 2022) [MoMe]

Spectral Evolution and Invariance in Linear-width Neural Networks
Zhichao Wang, A. Engel, Anand D. Sarwate, Ioana Dumitriu, Tony Chiang (11 Nov 2022)

Neural PDE Solvers for Irregular Domains
Biswajit Khara, Ethan Herron, Zhanhong Jiang, Aditya Balu, Chih-Hsuan Yang, ..., Anushrut Jignasu, Soumik Sarkar, Chinmay Hegde, A. Krishnamurthy, Baskar Ganapathysubramanian (07 Nov 2022) [AI4CE]

Flatter, faster: scaling momentum for optimal speedup of SGD
Aditya Cowsik, T. Can, Paolo Glorioso (28 Oct 2022)

Optimization for Amortized Inverse Problems
Tianci Liu, Tong Yang, Quan Zhang, Qi Lei (25 Oct 2022)

On skip connections and normalisation layers in deep optimisation
L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey (10 Oct 2022) [ODL]

Restricted Strong Convexity of Deep Learning Models with Smooth Activations
A. Banerjee, Pedro Cisneros-Velarde, Libin Zhu, M. Belkin (29 Sep 2022)

Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability
Peisong Wen, Qianqian Xu, Zhiyong Yang, Yuan He, Qingming Huang (27 Sep 2022)

Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold
Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu (19 Sep 2022)

BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach
Mao Ye, B. Liu, S. Wright, Peter Stone, Qian Liu (19 Sep 2022)

Asymptotic Statistical Analysis of f-divergence GAN
Xinwei Shen, Kani Chen, Tong Zhang (14 Sep 2022)

Optimizing the Performative Risk under Weak Convexity Assumptions
Yulai Zhao (02 Sep 2022)

On the generalization of learning algorithms that do not converge
N. Chandramoorthy, Andreas Loukas, Khashayar Gatmiry, Stefanie Jegelka (16 Aug 2022) [MLT]

A Theoretical Analysis of the Learning Dynamics under Class Imbalance
Emanuele Francazi, Marco Baity-Jesi, Aurelien Lucchi (01 Jul 2022)

A note on Linear Bottleneck networks and their Transition to Multilinearity
Libin Zhu, Parthe Pandit, M. Belkin (30 Jun 2022) [MLT]

Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks
G. Farhani, Alexander Kazachek, Boyu Wang (29 Jun 2022)

Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out
Jun-Kun Wang, Chi-Heng Lin, Andre Wibisono, Bin Hu (22 Jun 2022)

PRANC: Pseudo RAndom Networks for Compacting deep models
Parsa Nooralinejad, Ali Abbasi, Soroush Abbasi Koohpayegani, Kossar Pourahmadi Meibodi, Rana Muhammad Shahroz Khan, Soheil Kolouri, Hamed Pirsiavash (16 Jun 2022) [DD]

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion (02 Jun 2022) [ODL]

Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top
Eduard A. Gorbunov, Samuel Horváth, Peter Richtárik, Gauthier Gidel (01 Jun 2022) [AAML]

A Framework for Overparameterized Learning
Dávid Terjék, Diego González-Sánchez (26 May 2022) [MLT]

Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture
Libin Zhu, Chaoyue Liu, M. Belkin (24 May 2022) [GNN, AI4CE]

Beyond Lipschitz: Sharp Generalization and Excess Risk Bounds for Full-Batch GD
Konstantinos E. Nikolakakis, Farzin Haddadpour, Amin Karbasi, Dionysios S. Kalogerias (26 Apr 2022)

On Feature Learning in Neural Networks with Global Convergence Guarantees
Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna (22 Apr 2022) [MLT]

Convergence of gradient descent for deep neural networks
S. Chatterjee (30 Mar 2022) [ODL]

A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima
Tae-Eon Ko, Xiantao Li (21 Mar 2022)

Private Non-Convex Federated Learning Without a Trusted Server
Andrew Lowy, Ali Ghafelebashi, Meisam Razaviyayn (13 Mar 2022) [FedML]

Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models
Chaoyue Liu, Libin Zhu, M. Belkin (10 Mar 2022)

Federated Minimax Optimization: Improved Convergence Analyses and Algorithms
Pranay Sharma, Rohan Panda, Gauri Joshi, P. Varshney (09 Mar 2022) [FedML]

From Optimization Dynamics to Generalization Bounds via Łojasiewicz Gradient Inequality
Fusheng Liu, Haizhao Yang, Soufiane Hayou, Qianxiao Li (22 Feb 2022) [AI4CE]

Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
Bartlomiej Polaczyk, J. Cyranka (28 Jan 2022) [ODL]

Localization in Ensemble Kalman inversion
Xin T. Tong, Matthias Morzfeld (26 Jan 2022)

Approximation bounds for norm constrained neural networks with applications to regression and GANs
Yuling Jiao, Yang Wang, Yunfei Yang (24 Jan 2022)

Generalization in Supervised Learning Through Riemannian Contraction
L. Kozachkov, Patrick M. Wensing, Jean-Jacques E. Slotine (17 Jan 2022) [MLT]