Stochastic modified equations and adaptive stochastic gradient algorithms
Qianxiao Li, Cheng Tai, Weinan E
arXiv:1511.06251 · 19 November 2015
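
For context on the citing works below: the paper's central object is the stochastic modified equation (SME), a stochastic differential equation whose solution weakly approximates the SGD trajectory at small learning rates. A minimal sketch of the first-order SME, assuming the paper's standard setup (total loss $f = \mathbb{E}_{\gamma} f_{\gamma}$, learning rate $\eta$, Brownian motion $W_t$):

$$x_{k+1} = x_k - \eta\,\nabla f_{\gamma_k}(x_k) \qquad \text{(SGD iteration)}$$

$$\mathrm{d}X_t = -\nabla f(X_t)\,\mathrm{d}t + \sqrt{\eta}\,\Sigma(X_t)^{1/2}\,\mathrm{d}W_t, \qquad \Sigma(x) = \operatorname{Cov}_{\gamma}\!\big(\nabla f_{\gamma}(x)\big) \qquad \text{(first-order SME)}$$

Many of the citing papers listed below refine this approximation (higher-order drift corrections, non-Gaussian or state-dependent noise models, uniform-in-time error bounds) or carry it to related algorithms (momentum methods, stochastic ADMM, SAM, Riemannian SGD).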

Papers citing "Stochastic modified equations and adaptive stochastic gradient algorithms" (50 of 61 shown)
1. "Approximation to Deep Q-Network by Stochastic Delay Differential Equations" · Jianya Lu, Yingjun Mo · 01 May 2025
2. "Distributed Stochastic Gradient Descent with Staleness: A Stochastic Delay Differential Equation Based Framework" · Siyuan Yu, Wei Chen, H. V. Poor · 17 Jun 2024
3. "A General Continuous-Time Formulation of Stochastic ADMM and Its Variants" · Chris Junchi Li · 22 Apr 2024
4. "Stochastic Modified Flows for Riemannian Stochastic Gradient Descent" · Benjamin Gess, Sebastian Kassing, Nimit Rana · 02 Feb 2024
5. "Learning Rate Schedules in the Presence of Distribution Shift" · Matthew Fahrbach, Adel Javanmard, Vahab Mirrokni, Pratik Worah · 27 Mar 2023
6. "Revisiting the Noise Model of Stochastic Gradient Descent" · Barak Battash, Ofir Lindenbaum · 05 Mar 2023
7. "Stochastic Modified Flows, Mean-Field Limits and Dynamics of Stochastic Gradient Descent" · Benjamin Gess, Sebastian Kassing, Vitalii Konarovskyi · 14 Feb 2023 [DiffM]
8. "On a continuous time model of gradient descent dynamics and instability in deep learning" · Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin · 03 Feb 2023
9. "Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent" · Avrajit Ghosh, He Lyu, Xitong Zhang, Rongrong Wang · 02 Feb 2023
10. "Privacy Risk for anisotropic Langevin dynamics using relative entropy bounds" · Anastasia Borovykh, N. Kantas, P. Parpas, G. Pavliotis · 01 Feb 2023
11. "Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning" · Antonio Sclocchi, Mario Geiger, M. Wyart · 31 Jan 2023
12. "An SDE for Modeling SAM: Theory and Insights" · Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto, F. Proske, Hans Kersting, Aurelien Lucchi · 19 Jan 2023
13. "Training trajectories, mini-batch losses and the curious role of the learning rate" · Mark Sandler, A. Zhmoginov, Max Vladymyrov, Nolan Miller · 05 Jan 2023 [ODL]
14. "Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States" · Ziqiao Wang, Yongyi Mao · 19 Nov 2022
15. "Toward Equation of Motion for Deep Neural Networks: Continuous-time Gradient Descent and Discretization Error Analysis" · Taiki Miyagawa · 28 Oct 2022
16. "Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models" · Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma · 25 Oct 2022 [AI4CE]
17. "On uniform-in-time diffusion approximation for stochastic gradient descent" · Lei Li, Yuliang Wang · 11 Jul 2022
18. "Trajectory-dependent Generalization Bounds for Deep Neural Networks via Fractional Brownian Motion" · Chengli Tan, Jiang Zhang, Junmin Liu · 09 Jun 2022
19. "Optimal learning rate schedules in high-dimensional non-convex optimization problems" · Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli · 09 Feb 2022
20. "A Continuous-time Stochastic Gradient Descent Method for Continuous Data" · Kexin Jin, J. Latz, Chenguang Liu, Carola-Bibiane Schönlieb · 07 Dec 2021
21. "On Large Batch Training and Sharp Minima: A Fokker-Planck Perspective" · Xiaowu Dai, Yuhua Zhu · 02 Dec 2021
22. "Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations" · Jiayao Zhang, Hua Wang, Weijie J. Su · 11 Oct 2021
23. "Stochastic Training is Not Necessary for Generalization" · Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein · 29 Sep 2021
24. "Mixing between the Cross Entropy and the Expectation Loss Terms" · Barak Battash, Lior Wolf, Tamir Hazan · 12 Sep 2021 [UQCV]
25. "Shift-Curvature, SGD, and Generalization" · Arwen V. Bradley, C. Gomez-Uribe, Manish Reddy Vuyyuru · 21 Aug 2021
26. "The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion" · D. Kunin, Javier Sagastuy-Breña, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel L. K. Yamins · 19 Jul 2021
27. "Stochastic gradient descent with noise of machine learning type. Part I: Discrete time analysis" · Stephan Wojtowytsch · 04 May 2021
28. "Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization" · Zeke Xie, Li-xin Yuan, Zhanxing Zhu, Masashi Sugiyama · 31 Mar 2021
29. "On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)" · Zhiyuan Li, Sadhika Malladi, Sanjeev Arora · 24 Feb 2021
30. "Estimating informativeness of samples with Smooth Unique Information" · Hrayr Harutyunyan, Alessandro Achille, Giovanni Paolini, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto · 17 Jan 2021
31. "Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent" · Kangqiao Liu, Liu Ziyin, Masakuni Ueda · 07 Dec 2020 [MLT]
32. "Why resampling outperforms reweighting for correcting sampling bias with stochastic gradients" · Jing An, Lexing Ying, Yuhua Zhu · 28 Sep 2020
33. "Improved generalization by noise enhancement" · Takashi Mori, Masahito Ueda · 28 Sep 2020
34. "Obtaining Adjustable Regularization for Free via Iterate Averaging" · Jingfeng Wu, Vladimir Braverman, Lin F. Yang · 15 Aug 2020
35. "Adversarial Training Reduces Information and Improves Transferability" · M. Terzi, Alessandro Achille, Marco Maggipinto, Gian Antonio Susto · 22 Jul 2020 [AAML]
36. "A Differential Game Theoretic Neural Optimizer for Training Residual Networks" · Guan-Horng Liu, T. Chen, Evangelos A. Theodorou · 17 Jul 2020
37. "On stochastic mirror descent with interacting particles: convergence properties and variance reduction" · Anastasia Borovykh, N. Kantas, P. Parpas, G. Pavliotis · 15 Jul 2020
38. "Free-rider Attacks on Model Aggregation in Federated Learning" · Yann Fraboni, Richard Vidal, Marco Lorenzi · 21 Jun 2020 [FedML]
39. "Machine Learning and Control Theory" · A. Bensoussan, Yiqun Li, Dinh Phan Cao Nguyen, M. Tran, S. Yam, Xiang Zhou · 10 Jun 2020 [AI4CE]
40. "Stochastic Modified Equations for Continuous Limit of Stochastic ADMM" · Xiang Zhou, Huizhuo Yuan, C. J. Li, Qingyun Sun · 07 Mar 2020
41. "Non-Gaussianity of Stochastic Gradient Noise" · A. Panigrahi, Raghav Somani, Navin Goyal, Praneeth Netrapalli · 21 Oct 2019
42. "On the adequacy of untuned warmup for adaptive optimization" · Jerry Ma, Denis Yarats · 09 Oct 2019
43. "Neural ODEs as the Deep Limit of ResNets with constant weights" · B. Avelin, K. Nystrom · 28 Jun 2019 [ODL]
44. "On the Noisy Gradient Descent that Generalizes as SGD" · Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu · 18 Jun 2019 [MLT]
45. "Continuous Time Analysis of Momentum Methods" · Nikola B. Kovachki, Andrew M. Stuart · 10 Jun 2019
46. "An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise" · Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba · 21 Feb 2019 [ODL]
47. "A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks" · Umut Simsekli, Levent Sagun, Mert Gurbuzbalaban · 18 Jan 2019
48. "Towards Theoretical Understanding of Large Batch Training in Stochastic Gradient Descent" · Xiaowu Dai, Yuhua Zhu · 03 Dec 2018
49. "Quasi-hyperbolic momentum and Adam for deep learning" · Jerry Ma, Denis Yarats · 16 Oct 2018 [ODL]
50. "Continuous-time Models for Stochastic Optimization Algorithms" · Antonio Orvieto, Aurelien Lucchi · 05 Oct 2018