Non-convergence to the optimal risk for Adam and stochastic gradient descent optimization in the training of deep neural networks

3 March 2025

Papers citing "Non-convergence to the optimal risk for Adam and stochastic gradient descent optimization in the training of deep neural networks"

14 / 14 papers shown

Title
Mathematical analysis of the gradients in deep learning | Steffen Dereich, Thang Do, Arnulf Jentzen, Frederic Weber | MLT | 83 1 0 | 28 Jan 2025
Non-convergence of Adam and other adaptive stochastic gradient descent optimization methods for non-vanishing learning rates | Steffen Dereich, Robin Graeber, Arnulf Jentzen | | 29 3 0 | 11 Jul 2024
Convergence proof for stochastic gradient descent in the training of deep neural networks with ReLU activation for constant target functions | Martin Hutzenthaler, Arnulf Jentzen, Katharina Pohl, Adrian Riekert, Luca Scarpa | MLT | 98 7 0 | 13 Dec 2021
A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions | Arnulf Jentzen, Adrian Riekert | MLT | 76 13 0 | 01 Apr 2021
Zero-Shot Text-to-Image Generation | Aditya A. Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever | VLM | 420 5,005 0 | 24 Feb 2021
A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions | Patrick Cheridito, Arnulf Jentzen, Adrian Riekert, Florian Rossmannek | | 57 25 0 | 19 Feb 2021
An overview on deep learning-based approximation methods for partial differential equations | C. Beck, Martin Hutzenthaler, Arnulf Jentzen, Benno Kuckuck | | 96 153 0 | 22 Dec 2020
Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't | E. Weinan, Chao Ma, Stephan Wojtowytsch, Lei Wu | AI4CE | 112 134 0 | 22 Sep 2020
Expressivity of Deep Neural Networks | Ingo Gühring, Mones Raslan, Gitta Kutyniok | | 85 51 0 | 09 Jul 2020
Approximation with Neural Networks in Variable Lebesgue Spaces | Á. Capel, J. Ocáriz | | 19 6 0 | 08 Jul 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks | Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek | | 63 37 0 | 12 Jun 2020
A mathematical model for automatic differentiation in machine learning | Jérôme Bolte, Edouard Pauwels | | 65 68 0 | 03 Jun 2020
Language Models are Few-Shot Learners | Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei | BDL | 904 42,463 0 | 28 May 2020
Neural networks for option pricing and hedging: a literature review | Johannes Ruf, Weiguan Wang | | 73 129 0 | 13 Nov 2019