
Optimization Methods for Large-Scale Machine Learning
Léon Bottou, Frank E. Curtis, J. Nocedal
15 June 2016 · arXiv:1606.04838

Papers citing "Optimization Methods for Large-Scale Machine Learning"
50 of 1,406 papers shown. Each entry: title; authors [community tags]; site metrics · publication date.
Training Feedforward Neural Networks with Standard Logistic Activations is Feasible
Emanuele Sansone, F. D. De Natale
24 · 4 · 0 · 03 Oct 2017

How regularization affects the critical points in linear networks
Amirhossein Taghvaei, Jin-Won Kim, P. Mehta
26 · 13 · 0 · 27 Sep 2017

On Principal Components Regression, Random Projections, and Column Subsampling
M. Slawski
9 · 20 · 0 · 23 Sep 2017

Feedforward and Recurrent Neural Networks Backward Propagation and Hessian in Matrix Form
Maxim Naumov
23 · 9 · 0 · 16 Sep 2017

ClickBAIT: Click-based Accelerated Incremental Training of Convolutional Neural Networks
Ervin Teng, João Diogo Falcão, Bob Iannucci
33 · 14 · 0 · 15 Sep 2017

The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems
V. Patel [MLT]
17 · 8 · 0 · 14 Sep 2017
Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study
Peng Xu, Farbod Roosta-Khorasani, Michael W. Mahoney [ODL]
14 · 143 · 0 · 25 Aug 2017

Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information
Peng Xu, Farbod Roosta-Khorasani, Michael W. Mahoney
28 · 210 · 0 · 23 Aug 2017

Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
L. Smith, Nicholay Topin [AI4CE]
20 · 519 · 0 · 23 Aug 2017

Regularizing and Optimizing LSTM Language Models
Stephen Merity, N. Keskar, R. Socher
60 · 1,091 · 0 · 07 Aug 2017

On the convergence properties of a $K$-step averaging stochastic gradient descent algorithm for nonconvex optimization
Fan Zhou, Guojing Cong
46 · 232 · 0 · 03 Aug 2017
A Robust Multi-Batch L-BFGS Method for Machine Learning
A. Berahas, Martin Takáč [AAML, ODL]
19 · 44 · 0 · 26 Jul 2017

Warped Riemannian metrics for location-scale models
Salem Said, Lionel Bombrun, Y. Berthoumieu
37 · 15 · 0 · 22 Jul 2017

Stochastic, Distributed and Federated Optimization for Machine Learning
Jakub Konecný [FedML]
26 · 38 · 0 · 04 Jul 2017

Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning
Frank E. Curtis, K. Scheinberg
39 · 45 · 0 · 30 Jun 2017

Efficiency of quantum versus classical annealing in non-convex learning problems
Carlo Baldassi, R. Zecchina
16 · 43 · 0 · 26 Jun 2017
Faster independent component analysis by preconditioning with Hessian approximations
Pierre Ablin, J. Cardoso, Alexandre Gramfort [CML]
28 · 124 · 0 · 25 Jun 2017

Collaborative Deep Learning in Fixed Topology Networks
Zhanhong Jiang, Aditya Balu, C. Hegde, S. Sarkar [FedML]
21 · 179 · 0 · 23 Jun 2017

Improved Optimization of Finite Sums with Minibatch Stochastic Variance Reduced Proximal Iterations
Jialei Wang, Tong Zhang
19 · 12 · 0 · 21 Jun 2017

Gradient Diversity: a Key Ingredient for Scalable Distributed Learning
Dong Yin, A. Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, Peter L. Bartlett
9 · 11 · 0 · 18 Jun 2017

Stochastic Training of Neural Networks via Successive Convex Approximations
Simone Scardapane, P. Di Lorenzo
22 · 9 · 0 · 15 Jun 2017
Proximal Backpropagation
Thomas Frerix, Thomas Möllenhoff, Michael Möller, Daniel Cremers
23 · 31 · 0 · 14 Jun 2017

Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
Priya Goyal, Piotr Dollár, Ross B. Girshick, P. Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He [3DH]
22 · 3,649 · 0 · 08 Jun 2017

Diagonal Rescaling For Neural Networks
Jean Lafond, Nicolas Vasilache, Léon Bottou
6 · 11 · 0 · 25 May 2017

Diminishing Batch Normalization
Yintai Ma, Diego Klabjan
31 · 15 · 0 · 22 May 2017

On the diffusion approximation of nonconvex stochastic gradient descent
Junyang Qian, C. J. Li, Lei Li, Jianguo Liu [DiffM]
23 · 24 · 0 · 22 May 2017
EE-Grad: Exploration and Exploitation for Cost-Efficient Mini-Batch SGD
Mehmet A. Donmez, Maxim Raginsky, A. Singer [FedML]
9 · 0 · 0 · 19 May 2017

An Investigation of Newton-Sketch and Subsampled Newton Methods
A. Berahas, Raghu Bollapragada, J. Nocedal
19 · 111 · 0 · 17 May 2017

Efficient Parallel Methods for Deep Reinforcement Learning
Alfredo V. Clemente, Humberto Nicolás Castejón Martínez, A. Chandra
9 · 114 · 0 · 13 May 2017

Stable Architectures for Deep Neural Networks
E. Haber, Lars Ruthotto
23 · 714 · 0 · 09 May 2017

SEAGLE: Sparsity-Driven Image Reconstruction under Multiple Scattering
Hsiou-Yuan Liu, Dehong Liu, Hassan Mansour, P. Boufounos, Laura Waller, Ulugbek S. Kamilov
9 · 75 · 0 · 05 May 2017
Bandit Structured Prediction for Neural Sequence-to-Sequence Learning
Julia Kreutzer, Artem Sokolov, Stefan Riezler
27 · 49 · 0 · 21 Apr 2017

Deep Relaxation: partial differential equations for optimizing deep neural networks
Pratik Chaudhari, Adam M. Oberman, Stanley Osher, Stefano Soatto, G. Carlier
27 · 153 · 0 · 17 Apr 2017

Inference via low-dimensional couplings
Alessio Spantini, Daniele Bigoni, Youssef Marzouk
38 · 119 · 0 · 17 Mar 2017

Sharp Minima Can Generalize For Deep Nets
Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio [ODL]
46 · 757 · 0 · 15 Mar 2017

Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis
Hiroyuki Kasai, Hiroyuki Sato, Bamdev Mishra
13 · 22 · 0 · 15 Mar 2017
Learning across scales - A multiscale method for Convolution Neural Networks
E. Haber, Lars Ruthotto, E. Holtham, Seong-Hwan Jun
17 · 23 · 0 · 06 Mar 2017

Stochastic Functional Gradient for Motion Planning in Continuous Occupancy Maps
Gilad Francis, Lionel Ott, F. Ramos
16 · 16 · 0 · 01 Mar 2017

SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
Lam M. Nguyen, Jie Liu, K. Scheinberg, Martin Takáč [ODL]
28 · 597 · 0 · 01 Mar 2017

Stochastic Newton and Quasi-Newton Methods for Large Linear Least-squares Problems
Julianne Chung, Matthias Chung, J. T. Slagel, L. Tenorio
27 · 11 · 0 · 23 Feb 2017

On SGD's Failure in Practice: Characterizing and Overcoming Stalling
V. Patel
16 · 1 · 0 · 01 Feb 2017

Stochastic Subsampling for Factorizing Huge Matrices
A. Mensch, Julien Mairal, B. Thirion, Gaël Varoquaux
9 · 30 · 0 · 19 Jan 2017
Towards Principled Methods for Training Generative Adversarial Networks
Martin Arjovsky, Léon Bottou [GAN]
27 · 2,096 · 0 · 17 Jan 2017
Stochastic Generative Hashing
Bo Dai, Ruiqi Guo, Sanjiv Kumar, Niao He, Le Song [TPM]
35 · 106 · 0 · 11 Jan 2017

Coupling Adaptive Batch Sizes with Learning Rates
Lukas Balles, Javier Romero, Philipp Hennig [ODL]
21 · 110 · 0 · 15 Dec 2016

Federated Optimization: Distributed Machine Learning for On-Device Intelligence
Jakub Konecný, H. B. McMahan, Daniel Ramage, Peter Richtárik [FedML]
60 · 1,878 · 0 · 08 Oct 2016

Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite-Sum Structure
A. Bietti, Julien Mairal
44 · 36 · 0 · 04 Oct 2016

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang [ODL]
308 · 2,890 · 0 · 15 Sep 2016

Benchmarking State-of-the-Art Deep Learning Software Tools
S. Shi, Qiang-qiang Wang, Pengfei Xu, Xiaowen Chu [BDL]
14 · 327 · 0 · 25 Aug 2016

DOOMED: Direct Online Optimization of Modeling Errors in Dynamics
Nathan D. Ratliff, Franziska Meier, Daniel Kappler, S. Schaal
17 · 17 · 0 · 01 Aug 2016