1606.04838
Cited By
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing "Optimization Methods for Large-Scale Machine Learning" (50 of 1,407 papers shown)
DTN: A Learning Rate Scheme with Convergence Rate of $\mathcal{O}(1/t)$ for SGD
Lam M. Nguyen
Phuong Ha Nguyen
Dzung Phan
Jayant Kalagnanam
Marten van Dijk
33
0
0
22 Jan 2019
AccUDNN: A GPU Memory Efficient Accelerator for Training Ultra-deep Neural Networks
Jinrong Guo
Wantao Liu
Wang Wang
Q. Lu
Songlin Hu
Jizhong Han
Ruixuan Li
16
9
0
21 Jan 2019
Tuning parameter selection rules for nuclear norm regularized multivariate linear regression
Pan Shang
Lingchen Kong
23
1
0
19 Jan 2019
Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization
Sattar Vakili
Sudeep Salgia
Qing Zhao
17
7
0
17 Jan 2019
Block-Randomized Stochastic Proximal Gradient for Low-Rank Tensor Factorization
Xiao Fu
Shahana Ibrahim
Hoi-To Wai
Cheng Gao
Kejun Huang
17
37
0
16 Jan 2019
Optimization Problems for Machine Learning: A Survey
Claudio Gambella
Bissan Ghaddar
Joe Naoum-Sawaya
AI4CE
30
178
0
16 Jan 2019
CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers
A. Koliousis
Pijika Watcharapichat
Matthias Weidlich
Luo Mai
Paolo Costa
Peter R. Pietzuch
13
69
0
08 Jan 2019
SGD Converges to Global Minimum in Deep Learning via Star-convex Path
Yi Zhou
Junjie Yang
Huishuai Zhang
Yingbin Liang
Vahid Tarokh
14
71
0
02 Jan 2019
Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Rank-1 Robust Principal Component Analysis
S. Fattahi
Somayeh Sojoudi
14
38
0
30 Dec 2018
On Lazy Training in Differentiable Programming
Lénaïc Chizat
Edouard Oyallon
Francis R. Bach
46
806
0
19 Dec 2018
A stochastic approximation method for approximating the efficient frontier of chance-constrained nonlinear programs
R. Kannan
James R. Luedtke
8
4
0
17 Dec 2018
An Empirical Model of Large-Batch Training
Sam McCandlish
Jared Kaplan
Dario Amodei
OpenAI Dota Team
13
268
0
14 Dec 2018
Gradient Descent Happens in a Tiny Subspace
Guy Gur-Ari
Daniel A. Roberts
Ethan Dyer
30
228
0
12 Dec 2018
Layer-Parallel Training of Deep Residual Neural Networks
Stefanie Günther
Lars Ruthotto
J. Schroder
E. Cyr
N. Gauger
17
90
0
11 Dec 2018
A probabilistic incremental proximal gradient method
Ömer Deniz Akyildiz
Émilie Chouzenoux
Victor Elvira
Joaquín Míguez
8
3
0
04 Dec 2018
Image-based model parameter optimization using Model-Assisted Generative Adversarial Networks
Saúl Alonso-Monsalve
L. Whitehead
GAN
14
30
0
30 Nov 2018
Universal Adversarial Training
A. Mendrik
Mahyar Najibi
Zheng Xu
John P. Dickerson
L. Davis
Tom Goldstein
AAML
OOD
16
189
0
27 Nov 2018
Forward Stability of ResNet and Its Variants
Linan Zhang
Hayden Schaeffer
30
47
0
24 Nov 2018
Parallel sequential Monte Carlo for stochastic gradient-free nonconvex optimization
Ömer Deniz Akyildiz
Dan Crisan
Joaquín Míguez
15
5
0
23 Nov 2018
A Sufficient Condition for Convergences of Adam and RMSProp
Fangyu Zou
Li Shen
Zequn Jie
Weizhong Zhang
Wei Liu
33
364
0
23 Nov 2018
Distributed Gradient Descent with Coded Partial Gradient Computations
Emre Ozfatura
S. Ulukus
Deniz Gunduz
19
40
0
22 Nov 2018
New Convergence Aspects of Stochastic Gradient Algorithms
Lam M. Nguyen
Phuong Ha Nguyen
Peter Richtárik
K. Scheinberg
Martin Takáč
Marten van Dijk
23
66
0
10 Nov 2018
A Bayesian Perspective of Statistical Machine Learning for Big Data
R. Sambasivan
Sourish Das
S. Sahu
BDL
GP
14
19
0
09 Nov 2018
Double Adaptive Stochastic Gradient Optimization
Rajaditya Mukherjee
Jin Li
Shicheng Chu
Huamin Wang
ODL
24
0
0
06 Nov 2018
Non-Asymptotic Guarantees For Sampling by Stochastic Gradient Descent
Avetik G. Karagulyan
11
1
0
02 Nov 2018
Functional Nonlinear Sparse Models
Luiz F. O. Chamon
Yonina C. Eldar
Alejandro Ribeiro
11
11
0
01 Nov 2018
A general system of differential equations to model first order adaptive algorithms
André Belotto da Silva
Maxime Gazeau
11
33
0
31 Oct 2018
Kalman Gradient Descent: Adaptive Variance Reduction in Stochastic Optimization
James Vuckovic
ODL
16
15
0
29 Oct 2018
SpiderBoost and Momentum: Faster Stochastic Variance Reduction Algorithms
Zhe Wang
Kaiyi Ji
Yi Zhou
Yingbin Liang
Vahid Tarokh
ODL
35
81
0
25 Oct 2018
Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods
R. Freund
Paul Grigas
Rahul Mazumder
20
10
0
20 Oct 2018
Adaptive Communication Strategies to Achieve the Best Error-Runtime Trade-off in Local-Update SGD
Jianyu Wang
Gauri Joshi
FedML
33
231
0
19 Oct 2018
First-order and second-order variants of the gradient descent in a unified framework
Thomas Pierrot
Nicolas Perrin
Olivier Sigaud
ODL
30
7
0
18 Oct 2018
Fault Tolerance in Iterative-Convergent Machine Learning
Aurick Qiao
Bryon Aragam
Bingjing Zhang
Eric Xing
26
41
0
17 Oct 2018
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks
Xiaodong Cui
Wei Zhang
Zoltán Tüske
M. Picheny
ODL
16
89
0
16 Oct 2018
Approximate Fisher Information Matrix to Characterise the Training of Deep Neural Networks
Zhibin Liao
Tom Drummond
Ian Reid
G. Carneiro
14
21
0
16 Oct 2018
Deep Reinforcement Learning
Yuxi Li
VLM
OffRL
28
144
0
15 Oct 2018
Tight Dimension Independent Lower Bound on the Expected Convergence Rate for Diminishing Step Sizes in SGD
Phuong Ha Nguyen
Lam M. Nguyen
Marten van Dijk
LRM
12
31
0
10 Oct 2018
Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
Marten van Dijk
Lam M. Nguyen
Phuong Ha Nguyen
Dzung Phan
36
6
0
09 Oct 2018
Information Geometry of Orthogonal Initializations and Training
Piotr A. Sokól
Il-Su Park
AI4CE
77
16
0
09 Oct 2018
Principled Deep Neural Network Training through Linear Programming
D. Bienstock
Gonzalo Muñoz
Sebastian Pokutta
35
24
0
07 Oct 2018
Accelerating Stochastic Gradient Descent Using Antithetic Sampling
Jingchang Liu
Linli Xu
19
2
0
07 Oct 2018
Continuous-time Models for Stochastic Optimization Algorithms
Antonio Orvieto
Aurelien Lucchi
19
31
0
05 Oct 2018
Combining Natural Gradient with Hessian Free Methods for Sequence Training
Adnan Haider
P. Woodland
ODL
20
4
0
03 Oct 2018
Large batch size training of neural networks with adversarial training and second-order information
Z. Yao
A. Gholami
Daiyaan Arfeen
Richard Liaw
Joseph E. Gonzalez
Kurt Keutzer
Michael W. Mahoney
ODL
6
42
0
02 Oct 2018
Privacy-preserving Stochastic Gradual Learning
Bo Han
Ivor W. Tsang
Xiaokui Xiao
Ling-Hao Chen
S. Fung
C. Yu
NoLa
8
8
0
30 Sep 2018
Mini-batch Serialization: CNN Training with Inter-layer Data Reuse
Sangkug Lym
Armand Behroozi
W. Wen
Ge Li
Yongkee Kwon
M. Erez
12
25
0
30 Sep 2018
A fast quasi-Newton-type method for large-scale stochastic optimisation
A. Wills
Carl Jidling
Thomas B. Schon
ODL
28
7
0
29 Sep 2018
A Quantitative Analysis of the Effect of Batch Normalization on Gradient Descent
Yongqiang Cai
Qianxiao Li
Zuowei Shen
14
3
0
29 Sep 2018
Fluctuation-dissipation relations for stochastic gradient descent
Sho Yaida
32
73
0
28 Sep 2018
Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview
Yuejie Chi
Yue M. Lu
Yuxin Chen
39
416
0
25 Sep 2018