ResearchTrend.AI
Optimization Methods for Large-Scale Machine Learning

15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,407 papers shown
Stochastic Anderson Mixing for Nonconvex Stochastic Optimization
Fu Wei
Chenglong Bao
Yang Liu
35
19
0
04 Oct 2021
Inexact bilevel stochastic gradient methods for constrained and unconstrained lower-level problems
Tommaso Giovannelli
G. Kent
Luis Nunes Vicente
33
12
0
01 Oct 2021
slimTrain -- A Stochastic Approximation Method for Training Separable Deep Neural Networks
Elizabeth Newman
Julianne Chung
Matthias Chung
Lars Ruthotto
52
6
0
28 Sep 2021
An Accelerated Stochastic Gradient for Canonical Polyadic Decomposition
Ioanna Siaminou
A. Liavas
25
4
0
28 Sep 2021
Adaptive Sampling Quasi-Newton Methods for Zeroth-Order Stochastic Optimization
Raghu Bollapragada
Stefan M. Wild
37
11
0
24 Sep 2021
Inequality Constrained Stochastic Nonlinear Optimization via Active-Set Sequential Quadratic Programming
Sen Na
M. Anitescu
Mladen Kolar
34
33
0
23 Sep 2021
AdaLoss: A computationally-efficient and provably convergent adaptive gradient method
Xiaoxia Wu
Yuege Xie
S. Du
Rachel A. Ward
ODL
27
7
0
17 Sep 2021
Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Streaming Data
Antoine Godichon-Baggioni
Nicklas Werge
Olivier Wintenberger
25
7
0
15 Sep 2021
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo
Anran Hu
Junzi Zhang
OffRL
33
6
0
13 Sep 2021
Byzantine-robust Federated Learning through Collaborative Malicious Gradient Filtering
Jian Xu
Shao-Lun Huang
Linqi Song
Tian-Shing Lan
FedML
AAML
39
43
0
13 Sep 2021
Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information
Majid Jahani
S. Rusakov
Zheng Shi
Peter Richtárik
Michael W. Mahoney
Martin Takáč
ODL
24
25
0
11 Sep 2021
Self-adaptive deep neural network: Numerical approximation to functions and PDEs
Zhiqiang Cai
Jingshuang Chen
Min Liu
ODL
25
14
0
07 Sep 2021
Multi-agent Natural Actor-critic Reinforcement Learning Algorithms
Prashant Trivedi
N. Hemachandra
29
4
0
03 Sep 2021
Analytic natural gradient updates for Cholesky factor in Gaussian variational approximation
Linda S. L. Tan
25
11
0
01 Sep 2021
Quantized Convolutional Neural Networks Through the Lens of Partial Differential Equations
Ido Ben-Yair
Gil Ben Shalom
Moshe Eliasof
Eran Treister
MQ
36
5
0
31 Aug 2021
Approximate Bayesian Optimisation for Neural Networks
N. Hassen
Irina Rish
11
1
0
27 Aug 2021
The Number of Steps Needed for Nonconvex Optimization of a Deep Learning Optimizer is a Rational Function of Batch Size
Hideaki Iiduka
26
2
0
26 Aug 2021
Adaptive shot allocation for fast convergence in variational quantum algorithms
Andi Gu
Angus Lowe
Pavel A. Dub
Patrick J. Coles
A. Arrasmith
25
22
0
23 Aug 2021
Anarchic Federated Learning
Haibo Yang
Xin Zhang
Prashant Khanduri
Jia Liu
FedML
24
58
0
23 Aug 2021
Mobility-Aware Cluster Federated Learning in Hierarchical Wireless Networks
Chenyuan Feng
Heng Yang
Deshun Hu
Zhiwei Zhao
Tony Q.S. Quek
Geyong Min
38
74
0
20 Aug 2021
Cross-Silo Federated Learning for Multi-Tier Networks with Vertical and Horizontal Data Partitioning
Anirban Das
Timothy Castiglia
Shiqiang Wang
S. Patterson
FedML
13
19
0
19 Aug 2021
A proof of convergence for the gradient descent optimization method with random initializations in the training of neural networks with ReLU activation for piecewise linear target functions
Arnulf Jentzen
Adrian Riekert
38
13
0
10 Aug 2021
On the Hyperparameters in Stochastic Gradient Descent with Momentum
Bin Shi
14
14
0
09 Aug 2021
Uniform Sampling over Episode Difficulty
Sébastien M. R. Arnold
Guneet Singh Dhillon
Avinash Ravichandran
Stefano Soatto
26
14
0
03 Aug 2021
Numerical Solution of Stiff ODEs with Physics-Informed RPNNs
Evangelos Galaris
Gianluca Fabiani
Francesco Calabrò
D. Serafino
Constantinos Siettos
19
2
0
03 Aug 2021
Coordinate descent on the orthogonal group for recurrent neural network training
E. Massart
V. Abrol
39
10
0
30 Jul 2021
DQ-SGD: Dynamic Quantization in SGD for Communication-Efficient Distributed Learning
Guangfeng Yan
Shao-Lun Huang
Tian-Shing Lan
Linqi Song
MQ
14
6
0
30 Jul 2021
Decentralized Federated Learning: Balancing Communication and Computing Costs
Wei Liu
Li Chen
Wenyi Zhang
FedML
27
106
0
26 Jul 2021
A general sample complexity analysis of vanilla policy gradient
Rui Yuan
Robert Mansel Gower
A. Lazaric
82
62
0
23 Jul 2021
Improved Learning Rates for Stochastic Optimization: Two Theoretical Viewpoints
Shaojie Li
Yong Liu
26
13
0
19 Jul 2021
Differentially Private Bayesian Neural Networks on Accuracy, Privacy and Reliability
Qiyiwen Zhang
Zhiqi Bu
Kan Chen
Qi Long
BDL
UQCV
19
11
0
18 Jul 2021
Globally Convergent Multilevel Training of Deep Residual Networks
Alena Kopaničáková
Rolf Krause
37
15
0
15 Jul 2021
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
Shigang Li
Torsten Hoefler
GNN
AI4CE
LRM
80
132
0
14 Jul 2021
Nonlinear Least Squares for Large-Scale Machine Learning using Stochastic Jacobian Estimates
Johannes J Brust
13
2
0
12 Jul 2021
The Bayesian Learning Rule
Mohammad Emtiyaz Khan
Håvard Rue
BDL
68
73
0
09 Jul 2021
Activated Gradients for Deep Neural Networks
Mei Liu
Liangming Chen
Xiaohao Du
Long Jin
Mingsheng Shang
ODL
AI4CE
35
135
0
09 Jul 2021
KOALA: A Kalman Optimization Algorithm with Loss Adaptivity
A. Davtyan
Sepehr Sameni
L. Cerkezi
Givi Meishvili
Adam Bielski
Paolo Favaro
ODL
58
2
0
07 Jul 2021
KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks
J. G. Pauloski
Qi Huang
Lei Huang
Shivaram Venkataraman
Kyle Chard
Ian Foster
Zhao-jie Zhang
14
28
0
04 Jul 2021
A Comparison of the Delta Method and the Bootstrap in Deep Learning Classification
G. K. Nilsen
A. Munthe-Kaas
H. Skaug
M. Brun
39
0
0
04 Jul 2021
Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity
Nicolas Loizou
Hugo Berard
Gauthier Gidel
Ioannis Mitliagkas
Simon Lacoste-Julien
29
53
0
30 Jun 2021
Never Go Full Batch (in Stochastic Convex Optimization)
I Zaghloul Amir
Y. Carmon
Tomer Koren
Roi Livni
45
14
0
29 Jun 2021
The Convergence Rate of SGD's Final Iterate: Analysis on Dimension Dependence
Daogao Liu
Zhou Lu
LRM
32
1
0
28 Jun 2021
A Stochastic Sequential Quadratic Optimization Algorithm for Nonlinear Equality Constrained Optimization with Rank-Deficient Jacobians
A. Berahas
Frank E. Curtis
Michael O'Neill
Daniel P. Robinson
26
31
0
24 Jun 2021
Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators
Zaiwei Chen
S. T. Maguluri
Sanjay Shakkottai
Karthikeyan Shanmugam
OffRL
39
11
0
24 Jun 2021
Numerical influence of ReLU'(0) on backpropagation
David Bertoin
Jérôme Bolte
Sébastien Gerchinovitz
Edouard Pauwels
24
0
0
23 Jun 2021
Solving Stochastic Optimization with Expectation Constraints Efficiently by a Stochastic Augmented Lagrangian-Type Algorithm
Liwei Zhang
Yule Zhang
Jia Wu
X. Xiao
19
12
0
22 Jun 2021
Memory Augmented Optimizers for Deep Learning
Paul-Aymeric McRae
Prasanna Parthasarathi
Mahmoud Assran
Sarath Chandar
ODL
30
3
0
20 Jun 2021
STEM: A Stochastic Two-Sided Momentum Algorithm Achieving Near-Optimal Sample and Communication Complexities for Federated Learning
Prashant Khanduri
Pranay Sharma
Haibo Yang
Min-Fong Hong
Jia Liu
K. Rajawat
P. Varshney
FedML
27
63
0
19 Jun 2021
Interval and fuzzy physics-informed neural networks for uncertain fields
J. Fuhg
Ioannis Kalogeris
A. Fau
N. Bouklas
AI4CE
46
18
0
18 Jun 2021
Algorithmic Bias and Data Bias: Understanding the Relation between Distributionally Robust Optimization and Data Curation
Agnieszka Słowik
Léon Bottou
FaML
45
19
0
17 Jun 2021