Optimization Methods for Large-Scale Machine Learning
Léon Bottou, Frank E. Curtis, J. Nocedal
arXiv:1606.04838, 15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning" (showing 50 of 867)
  • Efficient Subsampled Gauss-Newton and Natural Gradient Methods for Training Neural Networks. Yi Ren, Shiqian Ma. 05 Jun 2019.
  • On the Convergence of SARAH and Beyond. Bingcong Li, Meng Ma, G. Giannakis. 05 Jun 2019.
  • Approximate Inference Turns Deep Networks into Gaussian Processes. Mohammad Emtiyaz Khan, Alexander Immer, Ehsan Abedi, M. Korzepa. 05 Jun 2019. [UQCV, BDL]
  • The Secrets of Machine Learning: Ten Things You Wish You Had Known Earlier to be More Effective at Data Analysis. Cynthia Rudin, David Carlson. 04 Jun 2019. [HAI]
  • A Generic Acceleration Framework for Stochastic Composite Optimization. A. Kulunchakov, Julien Mairal. 03 Jun 2019.
  • Scaling Up Quasi-Newton Algorithms: Communication Efficient Distributed SR1. Majid Jahani, M. Nazari, S. Rusakov, A. Berahas, Martin Takáč. 30 May 2019.
  • Limitations of the Empirical Fisher Approximation for Natural Gradient Descent. Frederik Kunstner, Lukas Balles, Philipp Hennig. 29 May 2019.
  • An Inertial Newton Algorithm for Deep Learning. Camille Castera, Jérôme Bolte, Cédric Févotte, Edouard Pauwels. 29 May 2019. [PINN, ODL]
  • Sample Complexity of Sample Average Approximation for Conditional Stochastic Optimization. Yifan Hu, Xin Chen, Niao He. 28 May 2019.
  • Recursive Estimation for Sparse Gaussian Process Regression. Manuel Schürch, Dario Azzimonti, A. Benavoli, Marco Zaffalon. 28 May 2019.
  • Finite-Sample Analysis of Nonlinear Stochastic Approximation with Applications in Reinforcement Learning. Zaiwei Chen, Sheng Zhang, Thinh T. Doan, John-Paul Clarke, S. T. Maguluri. 27 May 2019.
  • Robustness of accelerated first-order algorithms for strongly convex optimization problems. Hesameddin Mohammadi, Meisam Razaviyayn, M. Jovanović. 27 May 2019.
  • Decentralized Bayesian Learning over Graphs. Anusha Lalitha, Xinghan Wang, O. Kilinc, Y. Lu, T. Javidi, F. Koushanfar. 24 May 2019. [FedML]
  • Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models: Extension. Yunfei Teng, Wenbo Gao, F. Chalus, A. Choromańska, Shiqian Ma, Adrian Weller. 24 May 2019.
  • MATCHA: Speeding Up Decentralized SGD via Matching Decomposition Sampling. Jianyu Wang, Anit Kumar Sahu, Zhouyi Yang, Gauri Joshi, S. Kar. 23 May 2019.
  • LAGC: Lazily Aggregated Gradient Coding for Straggler-Tolerant and Communication-Efficient Distributed Learning. Jingjing Zhang, Osvaldo Simeone. 22 May 2019.
  • Client-Edge-Cloud Hierarchical Federated Learning. Lumin Liu, Jun Zhang, S. H. Song, Khaled B. Letaief. 16 May 2019. [FedML]
  • A Stochastic Gradient Method with Biased Estimation for Faster Nonconvex Optimization. Jia Bi, S. Gunn. 13 May 2019.
  • Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints. Mengtian Li, Ersin Yumer, Deva Ramanan. 12 May 2019.
  • On the Computation and Communication Complexity of Parallel SGD with Dynamic Batch Sizes for Stochastic Non-Convex Optimization. Hao Yu, Rong Jin. 10 May 2019.
  • Sparse multiresolution representations with adaptive kernels. Maria Peifer, Luiz F. O. Chamon, Santiago Paternain, Alejandro Ribeiro. 07 May 2019.
  • An Adaptive Remote Stochastic Gradient Method for Training Neural Networks. Yushu Chen, Hao Jing, Wenlai Zhao, Zhiqiang Liu, Haohuan Fu, Lián Qiao, Wei Xue, Guangwen Yang. 04 May 2019. [ODL]
  • Target-Based Temporal Difference Learning. Donghwan Lee, Niao He. 24 Apr 2019. [OOD]
  • Least Squares Auto-Tuning. Shane T. Barratt, Stephen P. Boyd. 10 Apr 2019. [MoMe]
  • Generalizing from a Few Examples: A Survey on Few-Shot Learning. Yaqing Wang, Quanming Yao, James T. Kwok, L. Ni. 10 Apr 2019.
  • Convergence rates for the stochastic gradient descent method for non-convex objective functions. Benjamin J. Fehrman, Benjamin Gess, Arnulf Jentzen. 02 Apr 2019.
  • Convergence rates for optimised adaptive importance samplers. Ömer Deniz Akyildiz, Joaquín Míguez. 28 Mar 2019.
  • OverSketched Newton: Fast Convex Optimization for Serverless Systems. Vipul Gupta, S. Kadhe, T. Courtade, Michael W. Mahoney, Kannan Ramchandran. 21 Mar 2019.
  • Noisy Accelerated Power Method for Eigenproblems with Applications. Vien V. Mai, M. Johansson. 20 Mar 2019.
  • TATi-Thermodynamic Analytics ToolkIt: TensorFlow-based software for posterior sampling in machine learning applications. Frederik Heber, Zofia Trstanova, Benedict Leimkuhler. 20 Mar 2019.
  • A Distributed Hierarchical SGD Algorithm with Sparse Global Reduction. Fan Zhou, Guojing Cong. 12 Mar 2019.
  • SGD without Replacement: Sharper Rates for General Smooth Convex Functions. Prateek Jain, Dheeraj M. Nagaraj, Praneeth Netrapalli. 04 Mar 2019.
  • Time-Delay Momentum: A Regularization Perspective on the Convergence and Generalization of Stochastic Momentum for Deep Learning. Ziming Zhang, Wenju Xu, Alan Sullivan. 02 Mar 2019.
  • An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise. Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba. 21 Feb 2019. [ODL]
  • Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network. Xiaoxia Wu, S. Du, Rachel A. Ward. 19 Feb 2019.
  • ProxSARAH: An Efficient Algorithmic Framework for Stochastic Composite Nonconvex Optimization. Nhan H. Pham, Lam M. Nguyen, Dzung Phan, Quoc Tran-Dinh. 15 Feb 2019.
  • Forward-backward-forward methods with variance reduction for stochastic variational inequalities. R. Boț, P. Mertikopoulos, Mathias Staudigl, P. Vuong. 09 Feb 2019.
  • Predict Globally, Correct Locally: Parallel-in-Time Optimal Control of Neural Networks. P. Parpas, Corey Muir. 07 Feb 2019. [OOD]
  • Negative eigenvalues of the Hessian in deep neural networks. Guillaume Alain, Nicolas Le Roux, Pierre-Antoine Manzagol. 06 Feb 2019.
  • Riemannian adaptive stochastic gradient algorithms on matrix manifolds. Hiroyuki Kasai, Pratik Jawanpuria, Bamdev Mishra. 04 Feb 2019.
  • Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions. Adrien B. Taylor, Francis R. Bach. 03 Feb 2019.
  • Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions. Yunwen Lei, Ting Hu, Guiying Li, K. Tang. 03 Feb 2019. [MLT]
  • Non-asymptotic Analysis of Biased Stochastic Approximation Scheme. Belhal Karimi, B. Miasojedow, Eric Moulines, Hoi-To Wai. 02 Feb 2019.
  • Variational Characterizations of Local Entropy and Heat Regularization in Deep Learning. Nicolas García Trillos, Zachary T. Kaplan, D. Sanz-Alonso. 29 Jan 2019. [ODL]
  • Quasi-Newton Methods for Machine Learning: Forget the Past, Just Sample. A. Berahas, Majid Jahani, Peter Richtárik, Martin Takáč. 28 Jan 2019.
  • SGD: General Analysis and Improved Rates. Robert Mansel Gower, Nicolas Loizou, Xun Qian, Alibek Sailanbayev, Egor Shulgin, Peter Richtárik. 27 Jan 2019.
  • Estimate Sequences for Stochastic Composite Optimization: Variance Reduction, Acceleration, and Robustness to Noise. A. Kulunchakov, Julien Mairal. 25 Jan 2019.
  • Provable Smoothness Guarantees for Black-Box Variational Inference. Justin Domke. 24 Jan 2019.
  • Large-Batch Training for LSTM and Beyond. Yang You, Jonathan Hseu, Chris Ying, J. Demmel, Kurt Keutzer, Cho-Jui Hsieh. 24 Jan 2019.
  • Decoupled Greedy Learning of CNNs. Eugene Belilovsky, Michael Eickenberg, Edouard Oyallon. 23 Jan 2019.