ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Continuous vs. Discrete Optimization of Deep Neural Networks

Omer Elkabetz, Nadav Cohen · 14 July 2021 · arXiv:2107.06608

Papers citing "Continuous vs. Discrete Optimization of Deep Neural Networks"

13 / 13 papers shown

  1. The Expected Loss of Preconditioned Langevin Dynamics Reveals the Hessian Rank — Amitay Bar, Rotem Mulayoff, T. Michaeli, Ronen Talmon (21 Feb 2024)
  2. Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond — Itai Kreisler, Mor Shpigel Nacson, Daniel Soudry, Y. Carmon (22 May 2023)
  3. On a continuous time model of gradient descent dynamics and instability in deep learning — Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin (03 Feb 2023)
  4. Symmetries, flat minima, and the conserved quantities of gradient flow — Bo-Lu Zhao, I. Ganev, Robin G. Walters, Rose Yu, Nima Dehmamy (31 Oct 2022)
  5. Perturbation Analysis of Neural Collapse — Tom Tirer, Haoxiang Huang, Jonathan Niles-Weed (29 Oct 2022) [AAML]
  6. Toward Equation of Motion for Deep Neural Networks: Continuous-time Gradient Descent and Discretization Error Analysis — Taiki Miyagawa (28 Oct 2022)
  7. From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent — Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan (13 Oct 2022)
  8. On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias — Itay Safran, Gal Vardi, Jason D. Lee (18 May 2022) [MLT]
  9. Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks — Bartlomiej Polaczyk, J. Cyranka (28 Jan 2022) [ODL]
  10. Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks — Noam Razin, Asaf Maman, Nadav Cohen (27 Jan 2022)
  11. Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics — D. Kunin, Javier Sagastuy-Breña, Surya Ganguli, Daniel L. K. Yamins, Hidenori Tanaka (08 Dec 2020)
  12. The large learning rate phase of deep learning: the catapult mechanism — Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari (04 Mar 2020) [ODL]
  13. A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights — Weijie Su, Stephen P. Boyd, Emmanuel J. Candes (04 Mar 2015)