1606.04838
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing "Optimization Methods for Large-Scale Machine Learning"
50 / 866 papers shown
Title
Snake: a Stochastic Proximal Gradient Algorithm for Regularized Problems over Large Graphs
Adil Salim
Pascal Bianchi
W. Hachem
65
17
0
19 Dec 2017
The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning
Siyuan Ma
Raef Bassily
M. Belkin
117
291
0
18 Dec 2017
Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks
Shankar Krishnan
Ying Xiao
Rif A. Saurous
ODL
45
20
0
08 Dec 2017
AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks
Aditya Devarakonda
Maxim Naumov
M. Garland
ODL
112
136
0
06 Dec 2017
A two-dimensional decomposition approach for matrix completion through gossip
Mukul Bhutani
Bamdev Mishra
26
0
0
21 Nov 2017
Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks
Ziming Zhang
M. Brand
59
71
0
20 Nov 2017
BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning
Ziming Zhang
Yuanwei Wu
Guanghui Wang
ODL
65
28
0
19 Nov 2017
Accelerated Method for Stochastic Composition Optimization with Nonsmooth Regularization
Zhouyuan Huo
Bin Gu
Ji Liu
Heng-Chiao Huang
93
51
0
10 Nov 2017
SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements
Francisco J. R. Ruiz
Susan Athey
David M. Blei
415
85
0
09 Nov 2017
Analysis of Biased Stochastic Gradient Descent Using Sequential Semidefinite Programs
Bin Hu
Peter M. Seiler
Laurent Lessard
121
40
0
03 Nov 2017
Don't Decay the Learning Rate, Increase the Batch Size
Samuel L. Smith
Pieter-Jan Kindermans
Chris Ying
Quoc V. Le
ODL
133
996
0
01 Nov 2017
Adaptive Sampling Strategies for Stochastic Optimization
Raghu Bollapragada
R. Byrd
J. Nocedal
54
116
0
30 Oct 2017
On the role of synaptic stochasticity in training low-precision neural networks
Carlo Baldassi
Federica Gerace
H. Kappen
Carlo Lucibello
Luca Saglietti
Enzo Tartaglione
R. Zecchina
55
23
0
26 Oct 2017
Avoiding Communication in Proximal Methods for Convex Optimization Problems
Saeed Soori
Aditya Devarakonda
J. Demmel
Mert Gurbuzbalaban
M. Dehnavi
34
7
0
24 Oct 2017
Smart "Predict, then Optimize"
Adam N. Elmachtoub
Paul Grigas
104
613
0
22 Oct 2017
AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition
Chun Yang
Xu-Cheng Yin
Zejun Li
Jianwei Wu
Chunchao Guo
Hongfa Wang
Lei Xiao
44
10
0
10 Oct 2017
Training Feedforward Neural Networks with Standard Logistic Activations is Feasible
Emanuele Sansone
F. D. De Natale
29
4
0
03 Oct 2017
How regularization affects the critical points in linear networks
Amirhossein Taghvaei
Jin-Won Kim
P. Mehta
77
13
0
27 Sep 2017
Feedforward and Recurrent Neural Networks Backward Propagation and Hessian in Matrix Form
Maxim Naumov
82
9
0
16 Sep 2017
ClickBAIT: Click-based Accelerated Incremental Training of Convolutional Neural Networks
Ervin Teng
João Diogo Falcão
Bob Iannucci
62
14
0
15 Sep 2017
The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems
V. Patel
MLT
73
8
0
14 Sep 2017
Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study
Peng Xu
Farbod Roosta-Khorasani
Michael W. Mahoney
ODL
84
145
0
25 Aug 2017
Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information
Peng Xu
Farbod Roosta-Khorasani
Michael W. Mahoney
133
214
0
23 Aug 2017
Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
L. Smith
Nicholay Topin
AI4CE
137
518
0
23 Aug 2017
Regularizing and Optimizing LSTM Language Models
Stephen Merity
N. Keskar
R. Socher
178
1,098
0
07 Aug 2017
On the convergence properties of a K-step averaging stochastic gradient descent algorithm for nonconvex optimization
Fan Zhou
Guojing Cong
186
236
0
03 Aug 2017
A Robust Multi-Batch L-BFGS Method for Machine Learning
A. Berahas
Martin Takáč
AAML
ODL
111
44
0
26 Jul 2017
Warped Riemannian metrics for location-scale models
Salem Said
Lionel Bombrun
Y. Berthoumieu
76
15
0
22 Jul 2017
Stochastic, Distributed and Federated Optimization for Machine Learning
Jakub Konecný
FedML
83
38
0
04 Jul 2017
Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning
Frank E. Curtis
K. Scheinberg
100
45
0
30 Jun 2017
Efficiency of quantum versus classical annealing in non-convex learning problems
Carlo Baldassi
R. Zecchina
78
45
0
26 Jun 2017
Faster independent component analysis by preconditioning with Hessian approximations
Pierre Ablin
J. Cardoso
Alexandre Gramfort
CML
87
127
0
25 Jun 2017
Collaborative Deep Learning in Fixed Topology Networks
Zhanhong Jiang
Aditya Balu
Chinmay Hegde
Soumik Sarkar
FedML
82
181
0
23 Jun 2017
Improved Optimization of Finite Sums with Minibatch Stochastic Variance Reduced Proximal Iterations
Jialei Wang
Tong Zhang
80
12
0
21 Jun 2017
Gradient Diversity: a Key Ingredient for Scalable Distributed Learning
Dong Yin
A. Pananjady
Max Lam
Dimitris Papailiopoulos
Kannan Ramchandran
Peter L. Bartlett
89
11
0
18 Jun 2017
Stochastic Training of Neural Networks via Successive Convex Approximations
Simone Scardapane
Paolo Di Lorenzo
43
9
0
15 Jun 2017
Proximal Backpropagation
Thomas Frerix
Thomas Möllenhoff
Michael Möller
Zorah Lähner
66
31
0
14 Jun 2017
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
226
3,692
0
08 Jun 2017
Diminishing Batch Normalization
Yintai Ma
Diego Klabjan
49
15
0
22 May 2017
EE-Grad: Exploration and Exploitation for Cost-Efficient Mini-Batch SGD
Mehmet A. Donmez
Maxim Raginsky
A. Singer
FedML
16
0
0
19 May 2017
An Investigation of Newton-Sketch and Subsampled Newton Methods
A. Berahas
Raghu Bollapragada
J. Nocedal
104
114
0
17 May 2017
Efficient Parallel Methods for Deep Reinforcement Learning
Alfredo V. Clemente
Humberto Nicolás Castejón Martínez
A. Chandra
85
115
0
13 May 2017
Stable Architectures for Deep Neural Networks
E. Haber
Lars Ruthotto
174
736
0
09 May 2017
SEAGLE: Sparsity-Driven Image Reconstruction under Multiple Scattering
Hsiou-Yuan Liu
Dehong Liu
Hassan Mansour
P. Boufounos
Laura Waller
Ulugbek S. Kamilov
50
77
0
05 May 2017
Bandit Structured Prediction for Neural Sequence-to-Sequence Learning
Julia Kreutzer
Artem Sokolov
Stefan Riezler
85
49
0
21 Apr 2017
Deep Relaxation: partial differential equations for optimizing deep neural networks
Pratik Chaudhari
Adam M. Oberman
Stanley Osher
Stefano Soatto
G. Carlier
174
154
0
17 Apr 2017
Inference via low-dimensional couplings
Alessio Spantini
Daniele Bigoni
Youssef Marzouk
145
119
0
17 Mar 2017
Sharp Minima Can Generalize For Deep Nets
Laurent Dinh
Razvan Pascanu
Samy Bengio
Yoshua Bengio
ODL
147
774
0
15 Mar 2017
Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis
Hiroyuki Kasai
Hiroyuki Sato
Bamdev Mishra
65
22
0
15 Mar 2017
SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
Lam M. Nguyen
Jie Liu
K. Scheinberg
Martin Takáč
ODL
177
608
0
01 Mar 2017