ResearchTrend.AI

Optimization Methods for Large-Scale Machine Learning (arXiv:1606.04838)

15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal

Papers citing "Optimization Methods for Large-Scale Machine Learning"

Showing 50 of 1,407 citing papers.
Train Like a (Var)Pro: Efficient Training of Neural Networks with Variable Projection
Elizabeth Newman
Lars Ruthotto
Joseph L. Hart
B. V. B. Waanders
AAML
36
19
0
26 Jul 2020
Online Robust and Adaptive Learning from Data Streams
Shintaro Fukushima
Atsushi Nitanda
Kenji Yamanishi
26
3
0
23 Jul 2020
Adversarial Training Reduces Information and Improves Transferability
M. Terzi
Alessandro Achille
Marco Maggipinto
Gian Antonio Susto
AAML
24
23
0
22 Jul 2020
Disentangling the Gauss-Newton Method and Approximate Inference for Neural Networks
Alexander Immer
BDL
19
4
0
21 Jul 2020
Sequential Quadratic Optimization for Nonlinear Equality Constrained Stochastic Optimization
A. Berahas
Frank E. Curtis
Daniel P. Robinson
Baoyu Zhou
26
51
0
20 Jul 2020
Asynchronous Federated Learning with Reduced Number of Rounds and with Differential Privacy from Less Aggregated Gaussian Noise
Marten van Dijk
Nhuong V. Nguyen
Toan N. Nguyen
Lam M. Nguyen
Quoc Tran-Dinh
Phuong Ha Nguyen
FedML
26
28
0
17 Jul 2020
Incremental Without Replacement Sampling in Nonconvex Optimization
Edouard Pauwels
38
5
0
15 Jul 2020
Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization
Jianyu Wang
Qinghua Liu
Hao Liang
Gauri Joshi
H. Vincent Poor
MoMe
FedML
23
1,304
0
15 Jul 2020
A Study of Gradient Variance in Deep Learning
Fartash Faghri
David Duvenaud
David J. Fleet
Jimmy Ba
FedML
ODL
22
27
0
09 Jul 2020
Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle
Shaocong Ma
Yi Zhou
27
3
0
07 Jul 2020
Efficient Learning of Generative Models via Finite-Difference Score Matching
Tianyu Pang
Kun Xu
Chongxuan Li
Yang Song
Stefano Ermon
Jun Zhu
DiffM
33
53
0
07 Jul 2020
DS-Sync: Addressing Network Bottlenecks with Divide-and-Shuffle Synchronization for Distributed DNN Training
Weiyan Wang
Cengguang Zhang
Liu Yang
Kai Chen
Kun Tan
34
12
0
07 Jul 2020
Doubly infinite residual neural networks: a diffusion process approach
Stefano Peluchetti
Stefano Favaro
22
2
0
07 Jul 2020
Improving Chinese Segmentation-free Word Embedding With Unsupervised Association Measure
Yifan Zhang
Maohua Wang
Yongjian Huang
Qianrong Gu
6
0
0
05 Jul 2020
Accuracy-Efficiency Trade-Offs and Accountability in Distributed ML Systems
A. Feder Cooper
K. Levy
Christopher De Sa
14
18
0
04 Jul 2020
Weak error analysis for stochastic gradient descent optimization algorithms
A. Bercher
Lukas Gonon
Arnulf Jentzen
Diyora Salimova
36
4
0
03 Jul 2020
Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems
Zhan Gao
Alec Koppel
Alejandro Ribeiro
33
10
0
02 Jul 2020
Federated Learning with Compression: Unified Analysis and Sharp Guarantees
Farzin Haddadpour
Mohammad Mahdi Kamani
Aryan Mokhtari
M. Mahdavi
FedML
42
274
0
02 Jul 2020
On the Outsized Importance of Learning Rates in Local Update Methods
Zachary B. Charles
Jakub Konečný
FedML
24
54
0
02 Jul 2020
Convolutional Neural Network Training with Distributed K-FAC
J. G. Pauloski
Zhao Zhang
Lei Huang
Weijia Xu
Ian Foster
23
30
0
01 Jul 2020
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
Scott Pesme
Aymeric Dieuleveut
Nicolas Flammarion
25
15
0
01 Jul 2020
AdaSGD: Bridging the gap between SGD and Adam
Jiaxuan Wang
Jenna Wiens
24
10
0
30 Jun 2020
A Multilevel Approach to Training
Vanessa Braglia
Alena Kopanicáková
Rolf Krause
20
2
0
28 Jun 2020
Is SGD a Bayesian sampler? Well, almost
Chris Mingard
Guillermo Valle Pérez
Joar Skalse
A. Louis
BDL
26
51
0
26 Jun 2020
What they do when in doubt: a study of inductive biases in seq2seq learners
Eugene Kharitonov
Rahma Chaabouni
22
27
0
26 Jun 2020
DeltaGrad: Rapid retraining of machine learning models
Yinjun Wu
Yan Sun
S. Davidson
MU
32
197
0
26 Jun 2020
Learning compositional functions via multiplicative weight updates
Jeremy Bernstein
Jiawei Zhao
M. Meister
Xuan Li
Anima Anandkumar
Yisong Yue
16
26
0
25 Jun 2020
Effective Elastic Scaling of Deep Learning Workloads
Vaibhav Saxena
K.R. Jayaram
Saurav Basu
Yogish Sabharwal
Ashish Verma
12
9
0
24 Jun 2020
Advances in Asynchronous Parallel and Distributed Optimization
Mahmoud Assran
Arda Aytekin
Hamid Reza Feyzmahdavian
M. Johansson
Michael G. Rabbat
28
76
0
24 Jun 2020
Hyperparameter Ensembles for Robustness and Uncertainty Quantification
F. Wenzel
Jasper Snoek
Dustin Tran
Rodolphe Jenatton
UQCV
35
204
0
24 Jun 2020
Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes
Shuai Zheng
Yanghua Peng
Sheng Zha
Mu Li
ODL
31
21
0
24 Jun 2020
Continuous Submodular Function Maximization
Yatao Bian
J. M. Buhmann
Andreas Krause
6
19
0
24 Jun 2020
Local Stochastic Approximation: A Unified View of Federated Learning and Distributed Multi-Task Reinforcement Learning Algorithms
Thinh T. Doan
FedML
17
9
0
24 Jun 2020
DeepTopPush: Simple and Scalable Method for Accuracy at the Top
V. Mácha
Lukáš Adam
Václav Smídl
14
2
0
22 Jun 2020
A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning
Samuel Horváth
Peter Richtárik
24
61
0
19 Jun 2020
SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
Robert Mansel Gower
Othmane Sebbouh
Nicolas Loizou
30
74
0
18 Jun 2020
A block coordinate descent optimizer for classification problems exploiting convexity
Ravi G. Patel
N. Trask
Mamikon A. Gulian
E. Cyr
ODL
32
7
0
17 Jun 2020
Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
Diego Granziol
S. Zohren
Stephen J. Roberts
ODL
42
49
0
16 Jun 2020
Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD
Ruosi Wan
Zhanxing Zhu
Xiangyu Zhang
Jian Sun
15
11
0
15 Jun 2020
Scalable Control Variates for Monte Carlo Methods via Stochastic Optimization
Shijing Si
Chris J. Oates
Andrew B. Duncan
Lawrence Carin
F. Briol
BDL
23
21
0
12 Jun 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
19
37
0
12 Jun 2020
Stochastic Optimization for Performative Prediction
Celestine Mendler-Dünner
Juan C. Perdomo
Tijana Zrnic
Moritz Hardt
8
113
0
12 Jun 2020
Random Reshuffling: Simple Analysis with Vast Improvements
Konstantin Mishchenko
Ahmed Khaled
Peter Richtárik
39
131
0
10 Jun 2020
A Modified AUC for Training Convolutional Neural Networks: Taking Confidence into Account
Khashayar Namdar
M. Haider
Farzad Khalvati
24
26
0
08 Jun 2020
The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization
Wei Tao
Zhisong Pan
Gao-wei Wu
Qing Tao
6
19
0
08 Jun 2020
Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis
Courtney Paquette
B. V. Merrienboer
Elliot Paquette
Fabian Pedregosa
37
25
0
08 Jun 2020
SONIA: A Symmetric Blockwise Truncated Optimization Algorithm
Majid Jahani
M. Nazari
R. Tappenden
A. Berahas
Martin Takáč
ODL
19
10
0
06 Jun 2020
UFO-BLO: Unbiased First-Order Bilevel Optimization
Valerii Likhosherstov
Xingyou Song
K. Choromanski
Jared Davis
Adrian Weller
36
7
0
05 Jun 2020
Scalable Plug-and-Play ADMM with Convergence Guarantees
Yu Sun
Zihui Wu
Xiaojian Xu
B. Wohlberg
Ulugbek S. Kamilov
BDL
40
74
0
05 Jun 2020
Asymptotic Analysis of Conditioned Stochastic Gradient Descent
Rémi Leluc
François Portier
28
2
0
04 Jun 2020