ResearchTrend.AI

Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions
arXiv: 1902.00908 · 3 February 2019
Yunwen Lei, Ting Hu, Guiying Li, K. Tang
MLT

Papers citing "Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions"

32 / 32 papers shown
Observability conditions for neural state-space models with eigenvalues and their roots of unity
Andrew Gracyk
22 Apr 2025
Dynamic Decoupling of Placid Terminal Attractor-based Gradient Descent Algorithm
Jinwei Zhao, Marco Gori, Alessandro Betti, S. Melacci, Hongtao Zhang, Jiedong Liu, Xinhong Hei
10 Sep 2024
Almost sure convergence rates of stochastic gradient methods under gradient domination
Simon Weissmann, Sara Klein, Waïss Azizian, Leif Döring
22 May 2024
Non-convergence to global minimizers for Adam and stochastic gradient descent optimization and constructions of local minimizers in the training of artificial neural networks
Arnulf Jentzen, Adrian Riekert
07 Feb 2024
Demystifying the Myths and Legends of Nonconvex Convergence of SGD
Aritra Dutta, El Houcine Bergou, Soumia Boucherouite, Nicklas Werge, M. Kandemir, Xin Li
19 Oct 2023
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods
Junchi Yang, Xiang Li, Ilyas Fatkhullin, Niao He
21 May 2023
Distributed Stochastic Optimization under a General Variance Condition
Kun-Yen Huang, Xiao Li, Shin-Yi Pu
FedML · 30 Jan 2023
FedVeca: Federated Vectorized Averaging on Non-IID Data with Adaptive Bi-directional Global Objective
Ping Luo, Jieren Cheng, Zhenhao Liu, N. Xiong, Jie Wu
FedML · 28 Sep 2022
Statistical Guarantees for Approximate Stationary Points of Simple Neural Networks
Mahsa Taheri, Fang Xie, Johannes Lederer
09 May 2022
A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima
Tae-Eon Ko, Xiantao Li
21 Mar 2022
Convergence proof for stochastic gradient descent in the training of deep neural networks with ReLU activation for constant target functions
Martin Hutzenthaler, Arnulf Jentzen, Katharina Pohl, Adrian Riekert, Luca Scarpa
MLT · 13 Dec 2021
A proof of convergence for the gradient descent optimization method with random initializations in the training of neural networks with ReLU activation for piecewise linear target functions
Arnulf Jentzen, Adrian Riekert
10 Aug 2021
A general sample complexity analysis of vanilla policy gradient
Rui Yuan, Robert Mansel Gower, A. Lazaric
23 Jul 2021
Improved Learning Rates for Stochastic Optimization: Two Theoretical Viewpoints
Shaojie Li, Yong Liu
19 Jul 2021
Decentralized Federated Averaging
Tao Sun, Dongsheng Li, Bao Wang
FedML · 23 Apr 2021
A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions
Arnulf Jentzen, Adrian Riekert
MLT · 01 Apr 2021
Spatio-Temporal Neural Network for Fitting and Forecasting COVID-19
Yi-Shuai Niu, Wentao Ding, Junpeng Hu, Wenxu Xu, S. Canu
22 Mar 2021
Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases
Arnulf Jentzen, T. Kröger
ODL · 23 Feb 2021
A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions
Patrick Cheridito, Arnulf Jentzen, Adrian Riekert, Florian Rossmannek
19 Feb 2021
Convergence of stochastic gradient descent schemes for Lojasiewicz-landscapes
Steffen Dereich, Sebastian Kassing
16 Feb 2021
A General Family of Stochastic Proximal Gradient Methods for Deep Learning
Jihun Yun, A. Lozano, Eunho Yang
15 Jul 2020
On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
P. Mertikopoulos, Nadav Hallak, Ali Kavis, V. Cevher
19 Jun 2020
SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
Robert Mansel Gower, Othmane Sebbouh, Nicolas Loizou
18 Jun 2020
Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent
Yunwen Lei, Yiming Ying
MLT · 15 Jun 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek
12 Jun 2020
Stopping Criteria for, and Strong Convergence of, Stochastic Gradient Descent on Bottou-Curtis-Nocedal Functions
V. Patel
01 Apr 2020
Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation
Arnulf Jentzen, Timo Welti
03 Mar 2020
Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Nicolas Loizou, Sharan Vaswani, I. Laradji, Simon Lacoste-Julien
24 Feb 2020
Better Theory for SGD in the Nonconvex World
Ahmed Khaled, Peter Richtárik
09 Feb 2020
Solving the Kolmogorov PDE by means of deep learning
C. Beck, S. Becker, Philipp Grohs, Nor Jaafari, Arnulf Jentzen
01 Jun 2018
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
Hamed Karimi, J. Nutini, Mark W. Schmidt
16 Aug 2016
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
Simon Lacoste-Julien, Mark W. Schmidt, Francis R. Bach
10 Dec 2012