1606.04838
Cited By
v3 (latest)
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing "Optimization Methods for Large-Scale Machine Learning"
50 / 867 papers shown
Title
On Communication Compression for Distributed Optimization on Heterogeneous Data
Sebastian U. Stich
97
23
0
04 Sep 2020
Learning explanations that are hard to vary
Giambattista Parascandolo
Alexander Neitz
Antonio Orvieto
Luigi Gresele
Bernhard Schölkopf
FAtt
90
187
0
01 Sep 2020
Beyond variance reduction: Understanding the true impact of baselines on policy optimization
Wesley Chung
Valentin Thomas
Marlos C. Machado
Nicolas Le Roux
OffRL
114
24
0
31 Aug 2020
Wireless for Machine Learning
Henrik Hellström
J. M. B. D. Silva
Mohammad Mohammadi Amiri
Mingzhe Chen
Viktoria Fodor
H. Vincent Poor
Carlo Fischione
85
18
0
31 Aug 2020
Understanding and Detecting Convergence for Stochastic Gradient Descent with Momentum
Jerry Chee
Ping Li
45
12
0
27 Aug 2020
Optimization with learning-informed differential equation constraints and its applications
Guozhi Dong
M. Hintermueller
Kostas Papafitsoros
PINN
56
14
0
25 Aug 2020
Solving Stochastic Compositional Optimization is Nearly as Easy as Solving Stochastic Optimization
Tianyi Chen
Yuejiao Sun
W. Yin
139
82
0
25 Aug 2020
Channel-Directed Gradients for Optimization of Convolutional Neural Networks
Dong Lao
Peihao Zhu
Peter Wonka
G. Sundaramoorthi
96
3
0
25 Aug 2020
Data-Driven Aerospace Engineering: Reframing the Industry with Machine Learning
Steven L. Brunton
J. Nathan Kutz
Krithika Manohar
Aleksandr Aravkin
K. Morgansen
...
J. Buttrick
Jeffrey Poskin
Agnes Blom-Schieber
Thomas Hogan
Darren McDonald
AI4CE
68
132
0
24 Aug 2020
Whitening and second order optimization both make information in the dataset unusable during training, and can reduce or prevent generalization
Neha S. Wadia
Daniel Duckworth
S. Schoenholz
Ethan Dyer
Jascha Narain Sohl-Dickstein
104
13
0
17 Aug 2020
Fast decentralized non-convex finite-sum optimization with recursive variance reduction
Ran Xin
U. Khan
S. Kar
108
43
0
17 Aug 2020
Privacy-Preserving Distributed Learning Framework for 6G Telecom Ecosystems
P. Safari
B. Shariati
J. Fischer
FedML
24
6
0
17 Aug 2020
Push-SAGA: A decentralized stochastic algorithm with variance reduction over directed graphs
Muhammad I. Qureshi
Ran Xin
S. Kar
U. Khan
97
21
0
13 Aug 2020
Byzantine Fault-Tolerant Distributed Machine Learning Using Stochastic Gradient Descent (SGD) and Norm-Based Comparative Gradient Elimination (CGE)
Nirupam Gupta
Shuo Liu
Nitin H. Vaidya
FedML
84
11
0
11 Aug 2020
An improved convergence analysis for decentralized online stochastic non-convex optimization
Ran Xin
U. Khan
S. Kar
118
104
0
10 Aug 2020
A Survey on Large-scale Machine Learning
Meng Wang
Weijie Fu
Xiangnan He
Shijie Hao
Xindong Wu
84
112
0
10 Aug 2020
DINE: A Framework for Deep Incomplete Network Embedding
Ke Hou
Jiaying Liu
Yin Peng
Bo Xu
Ivan Lee
Xiwei Xu
37
3
0
09 Aug 2020
Large-time asymptotics in deep learning
Carlos Esteve
Borjan Geshkovski
Dario Pighin
Enrique Zuazua
371
34
0
06 Aug 2020
Accelerating Federated Learning over Reliability-Agnostic Clients in Mobile Edge Computing Systems
Wentai Wu
Ligang He
Weiwei Lin
Rui Mao
70
81
0
28 Jul 2020
A Comparison of Optimization Algorithms for Deep Learning
Derya Soydaner
159
159
0
28 Jul 2020
Multi-Level Local SGD for Heterogeneous Hierarchical Networks
Timothy Castiglia
Anirban Das
S. Patterson
69
13
0
27 Jul 2020
Binary Search and First Order Gradient Based Method for Stochastic Optimization
V. Pandey
ODL
41
0
0
27 Jul 2020
Adversarial Training Reduces Information and Improves Transferability
M. Terzi
Alessandro Achille
Marco Maggipinto
Gian Antonio Susto
AAML
106
23
0
22 Jul 2020
Sequential Quadratic Optimization for Nonlinear Equality Constrained Stochastic Optimization
A. Berahas
Frank E. Curtis
Daniel P. Robinson
Baoyu Zhou
63
54
0
20 Jul 2020
Asynchronous Federated Learning with Reduced Number of Rounds and with Differential Privacy from Less Aggregated Gaussian Noise
Marten van Dijk
Nhuong V. Nguyen
Toan N. Nguyen
Lam M. Nguyen
Quoc Tran-Dinh
Phuong Ha Nguyen
FedML
89
29
0
17 Jul 2020
Incremental Without Replacement Sampling in Nonconvex Optimization
Edouard Pauwels
72
5
0
15 Jul 2020
Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization
Jianyu Wang
Qinghua Liu
Hao Liang
Gauri Joshi
H. Vincent Poor
MoMe
FedML
75
1,366
0
15 Jul 2020
A Study of Gradient Variance in Deep Learning
Fartash Faghri
David Duvenaud
David J. Fleet
Jimmy Ba
FedML
ODL
59
27
0
09 Jul 2020
Efficient Learning of Generative Models via Finite-Difference Score Matching
Tianyu Pang
Kun Xu
Chongxuan Li
Yang Song
Stefano Ermon
Jun Zhu
DiffM
104
55
0
07 Jul 2020
DS-Sync: Addressing Network Bottlenecks with Divide-and-Shuffle Synchronization for Distributed DNN Training
Weiyan Wang
Cengguang Zhang
Liu Yang
Kai Chen
Kun Tan
75
12
0
07 Jul 2020
Doubly infinite residual neural networks: a diffusion process approach
Stefano Peluchetti
Stefano Favaro
39
2
0
07 Jul 2020
Federated Learning with Compression: Unified Analysis and Sharp Guarantees
Farzin Haddadpour
Mohammad Mahdi Kamani
Aryan Mokhtari
M. Mahdavi
FedML
125
281
0
02 Jul 2020
On the Outsized Importance of Learning Rates in Local Update Methods
Zachary B. Charles
Jakub Konečný
FedML
92
54
0
02 Jul 2020
Convolutional Neural Network Training with Distributed K-FAC
J. G. Pauloski
Zhao Zhang
Lei Huang
Weijia Xu
Ian Foster
59
31
0
01 Jul 2020
Is SGD a Bayesian sampler? Well, almost
Chris Mingard
Guillermo Valle Pérez
Joar Skalse
A. Louis
BDL
79
53
0
26 Jun 2020
What they do when in doubt: a study of inductive biases in seq2seq learners
Eugene Kharitonov
Rahma Chaabouni
79
27
0
26 Jun 2020
DeltaGrad: Rapid retraining of machine learning models
Yinjun Wu
Yan Sun
S. Davidson
MU
82
202
0
26 Jun 2020
Learning compositional functions via multiplicative weight updates
Jeremy Bernstein
Jiawei Zhao
M. Meister
Xuan Li
Anima Anandkumar
Yisong Yue
81
27
0
25 Jun 2020
Advances in Asynchronous Parallel and Distributed Optimization
Mahmoud Assran
Arda Aytekin
Hamid Reza Feyzmahdavian
M. Johansson
Michael G. Rabbat
79
78
0
24 Jun 2020
Hyperparameter Ensembles for Robustness and Uncertainty Quantification
F. Wenzel
Jasper Snoek
Dustin Tran
Rodolphe Jenatton
UQCV
117
212
0
24 Jun 2020
Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes
Shuai Zheng
Yanghua Peng
Sheng Zha
Mu Li
ODL
72
21
0
24 Jun 2020
Local Stochastic Approximation: A Unified View of Federated Learning and Distributed Multi-Task Reinforcement Learning Algorithms
Thinh T. Doan
FedML
61
10
0
24 Jun 2020
A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning
Samuel Horváth
Peter Richtárik
79
60
0
19 Jun 2020
SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
Robert Mansel Gower
Othmane Sebbouh
Nicolas Loizou
135
76
0
18 Jun 2020
Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
Diego Granziol
S. Zohren
Stephen J. Roberts
ODL
148
50
0
16 Jun 2020
Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD
Ruosi Wan
Zhanxing Zhu
Xiangyu Zhang
Jian Sun
78
11
0
15 Jun 2020
Scalable Control Variates for Monte Carlo Methods via Stochastic Optimization
Shijing Si
Chris J. Oates
Andrew B. Duncan
Lawrence Carin
F. Briol
BDL
65
21
0
12 Jun 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
80
37
0
12 Jun 2020
Stochastic Optimization for Performative Prediction
Celestine Mendler-Dünner
Juan C. Perdomo
Tijana Zrnic
Moritz Hardt
63
115
0
12 Jun 2020
Random Reshuffling: Simple Analysis with Vast Improvements
Konstantin Mishchenko
Ahmed Khaled
Peter Richtárik
133
135
0
10 Jun 2020