1606.04838
Cited By
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing
"Optimization Methods for Large-Scale Machine Learning"
50 / 1,407 papers shown
Title
Train Like a (Var)Pro: Efficient Training of Neural Networks with Variable Projection
Elizabeth Newman
Lars Ruthotto
Joseph L. Hart
B. V. B. Waanders
AAML
36
19
0
26 Jul 2020
Online Robust and Adaptive Learning from Data Streams
Shintaro Fukushima
Atsushi Nitanda
Kenji Yamanishi
26
3
0
23 Jul 2020
Adversarial Training Reduces Information and Improves Transferability
M. Terzi
Alessandro Achille
Marco Maggipinto
Gian Antonio Susto
AAML
24
23
0
22 Jul 2020
Disentangling the Gauss-Newton Method and Approximate Inference for Neural Networks
Alexander Immer
BDL
19
4
0
21 Jul 2020
Sequential Quadratic Optimization for Nonlinear Equality Constrained Stochastic Optimization
A. Berahas
Frank E. Curtis
Daniel P. Robinson
Baoyu Zhou
26
51
0
20 Jul 2020
Asynchronous Federated Learning with Reduced Number of Rounds and with Differential Privacy from Less Aggregated Gaussian Noise
Marten van Dijk
Nhuong V. Nguyen
Toan N. Nguyen
Lam M. Nguyen
Quoc Tran-Dinh
Phuong Ha Nguyen
FedML
26
28
0
17 Jul 2020
Incremental Without Replacement Sampling in Nonconvex Optimization
Edouard Pauwels
38
5
0
15 Jul 2020
Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization
Jianyu Wang
Qinghua Liu
Hao Liang
Gauri Joshi
H. Vincent Poor
MoMe
FedML
23
1,304
0
15 Jul 2020
A Study of Gradient Variance in Deep Learning
Fartash Faghri
David Duvenaud
David J. Fleet
Jimmy Ba
FedML
ODL
22
27
0
09 Jul 2020
Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle
Shaocong Ma
Yi Zhou
27
3
0
07 Jul 2020
Efficient Learning of Generative Models via Finite-Difference Score Matching
Tianyu Pang
Kun Xu
Chongxuan Li
Yang Song
Stefano Ermon
Jun Zhu
DiffM
33
53
0
07 Jul 2020
DS-Sync: Addressing Network Bottlenecks with Divide-and-Shuffle Synchronization for Distributed DNN Training
Weiyan Wang
Cengguang Zhang
Liu Yang
Kai Chen
Kun Tan
34
12
0
07 Jul 2020
Doubly infinite residual neural networks: a diffusion process approach
Stefano Peluchetti
Stefano Favaro
22
2
0
07 Jul 2020
Improving Chinese Segmentation-free Word Embedding With Unsupervised Association Measure
Yifan Zhang
Maohua Wang
Yongjian Huang
Qianrong Gu
6
0
0
05 Jul 2020
Accuracy-Efficiency Trade-Offs and Accountability in Distributed ML Systems
A. Feder Cooper
K. Levy
Christopher De Sa
14
18
0
04 Jul 2020
Weak error analysis for stochastic gradient descent optimization algorithms
A. Bercher
Lukas Gonon
Arnulf Jentzen
Diyora Salimova
36
4
0
03 Jul 2020
Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems
Zhan Gao
Alec Koppel
Alejandro Ribeiro
33
10
0
02 Jul 2020
Federated Learning with Compression: Unified Analysis and Sharp Guarantees
Farzin Haddadpour
Mohammad Mahdi Kamani
Aryan Mokhtari
M. Mahdavi
FedML
42
274
0
02 Jul 2020
On the Outsized Importance of Learning Rates in Local Update Methods
Zachary B. Charles
Jakub Konečný
FedML
24
54
0
02 Jul 2020
Convolutional Neural Network Training with Distributed K-FAC
J. G. Pauloski
Zhao Zhang
Lei Huang
Weijia Xu
Ian Foster
23
30
0
01 Jul 2020
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
Scott Pesme
Aymeric Dieuleveut
Nicolas Flammarion
25
15
0
01 Jul 2020
AdaSGD: Bridging the gap between SGD and Adam
Jiaxuan Wang
Jenna Wiens
24
10
0
30 Jun 2020
A Multilevel Approach to Training
Vanessa Braglia
Alena Kopaničáková
Rolf Krause
20
2
0
28 Jun 2020
Is SGD a Bayesian sampler? Well, almost
Chris Mingard
Guillermo Valle Pérez
Joar Skalse
A. Louis
BDL
26
51
0
26 Jun 2020
What they do when in doubt: a study of inductive biases in seq2seq learners
Eugene Kharitonov
Rahma Chaabouni
22
27
0
26 Jun 2020
DeltaGrad: Rapid retraining of machine learning models
Yinjun Wu
Yan Sun
S. Davidson
MU
32
197
0
26 Jun 2020
Learning compositional functions via multiplicative weight updates
Jeremy Bernstein
Jiawei Zhao
M. Meister
Xuan Li
Anima Anandkumar
Yisong Yue
16
26
0
25 Jun 2020
Effective Elastic Scaling of Deep Learning Workloads
Vaibhav Saxena
K.R. Jayaram
Saurav Basu
Yogish Sabharwal
Ashish Verma
12
9
0
24 Jun 2020
Advances in Asynchronous Parallel and Distributed Optimization
Mahmoud Assran
Arda Aytekin
Hamid Reza Feyzmahdavian
M. Johansson
Michael G. Rabbat
28
76
0
24 Jun 2020
Hyperparameter Ensembles for Robustness and Uncertainty Quantification
F. Wenzel
Jasper Snoek
Dustin Tran
Rodolphe Jenatton
UQCV
35
204
0
24 Jun 2020
Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes
Shuai Zheng
Yanghua Peng
Sheng Zha
Mu Li
ODL
31
21
0
24 Jun 2020
Continuous Submodular Function Maximization
Yatao Bian
J. M. Buhmann
Andreas Krause
6
19
0
24 Jun 2020
Local Stochastic Approximation: A Unified View of Federated Learning and Distributed Multi-Task Reinforcement Learning Algorithms
Thinh T. Doan
FedML
17
9
0
24 Jun 2020
DeepTopPush: Simple and Scalable Method for Accuracy at the Top
V. Mácha
Lukáš Adam
Václav Šmídl
14
2
0
22 Jun 2020
A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning
Samuel Horváth
Peter Richtárik
24
61
0
19 Jun 2020
SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
Robert Mansel Gower
Othmane Sebbouh
Nicolas Loizou
30
74
0
18 Jun 2020
A block coordinate descent optimizer for classification problems exploiting convexity
Ravi G. Patel
N. Trask
Mamikon A. Gulian
E. Cyr
ODL
32
7
0
17 Jun 2020
Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
Diego Granziol
S. Zohren
Stephen J. Roberts
ODL
42
49
0
16 Jun 2020
Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD
Ruosi Wan
Zhanxing Zhu
Xiangyu Zhang
Jian Sun
15
11
0
15 Jun 2020
Scalable Control Variates for Monte Carlo Methods via Stochastic Optimization
Shijing Si
Chris J. Oates
Andrew B. Duncan
Lawrence Carin
F. Briol
BDL
23
21
0
12 Jun 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
19
37
0
12 Jun 2020
Stochastic Optimization for Performative Prediction
Celestine Mendler-Dünner
Juan C. Perdomo
Tijana Zrnic
Moritz Hardt
8
113
0
12 Jun 2020
Random Reshuffling: Simple Analysis with Vast Improvements
Konstantin Mishchenko
Ahmed Khaled
Peter Richtárik
39
131
0
10 Jun 2020
A Modified AUC for Training Convolutional Neural Networks: Taking Confidence into Account
Khashayar Namdar
M. Haider
Farzad Khalvati
24
26
0
08 Jun 2020
The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization
Wei Tao
Zhisong Pan
Gao-wei Wu
Qing Tao
6
19
0
08 Jun 2020
Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis
Courtney Paquette
B. V. Merrienboer
Elliot Paquette
Fabian Pedregosa
37
25
0
08 Jun 2020
SONIA: A Symmetric Blockwise Truncated Optimization Algorithm
Majid Jahani
M. Nazari
R. Tappenden
A. Berahas
Martin Takáč
ODL
19
10
0
06 Jun 2020
UFO-BLO: Unbiased First-Order Bilevel Optimization
Valerii Likhosherstov
Xingyou Song
K. Choromanski
Jared Davis
Adrian Weller
36
7
0
05 Jun 2020
Scalable Plug-and-Play ADMM with Convergence Guarantees
Yu Sun
Zihui Wu
Xiaojian Xu
B. Wohlberg
Ulugbek S. Kamilov
BDL
40
74
0
05 Jun 2020
Asymptotic Analysis of Conditioned Stochastic Gradient Descent
Rémi Leluc
François Portier
28
2
0
04 Jun 2020