Optimization Methods for Large-Scale Machine Learning
Léon Bottou, Frank E. Curtis, J. Nocedal
arXiv:1606.04838 (v3, latest), 15 June 2016
Papers citing "Optimization Methods for Large-Scale Machine Learning" (showing 50 of 867):
- A Modified AUC for Training Convolutional Neural Networks: Taking Confidence into Account. Khashayar Namdar, M. Haider, Farzad Khalvati (08 Jun 2020)
- The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization. Wei Tao, Zhisong Pan, Gao-wei Wu, Qing Tao (08 Jun 2020)
- Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis. Courtney Paquette, B. V. Merrienboer, Elliot Paquette, Fabian Pedregosa (08 Jun 2020)
- SONIA: A Symmetric Blockwise Truncated Optimization Algorithm. Majid Jahani, M. Nazari, R. Tappenden, A. Berahas, Martin Takáč (06 Jun 2020) [ODL]
- UFO-BLO: Unbiased First-Order Bilevel Optimization. Valerii Likhosherstov, Xingyou Song, K. Choromanski, Jared Davis, Adrian Weller (05 Jun 2020)
- Scalable Plug-and-Play ADMM with Convergence Guarantees. Yu Sun, Zihui Wu, Xiaojian Xu, B. Wohlberg, Ulugbek S. Kamilov (05 Jun 2020) [BDL]
- Asymptotic Analysis of Conditioned Stochastic Gradient Descent. Rémi Leluc, François Portier (04 Jun 2020)
- A mathematical model for automatic differentiation in machine learning. Jérôme Bolte, Edouard Pauwels (03 Jun 2020)
- Finite Difference Neural Networks: Fast Prediction of Partial Differential Equations. Zheng Shi, Nur Sila Gulgec, A. Berahas, S. Pakzad, Martin Takáč (02 Jun 2020)
- Carathéodory Sampling for Stochastic Gradient Descent. Francesco Cosentino, Harald Oberhauser, Alessandro Abate (02 Jun 2020)
- Artificial neural networks for neuroscientists: A primer. G. R. Yang, Xiao-Jing Wang (01 Jun 2020)
- Data-Driven Methods to Monitor, Model, Forecast and Control Covid-19 Pandemic: Leveraging Data Science, Epidemiology and Control Theory. Teodoro Alamo, Daniel Gutiérrez-Reina, P. Millán (01 Jun 2020)
- ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning. Z. Yao, A. Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney (01 Jun 2020) [ODL]
- A New Accelerated Stochastic Gradient Method with Momentum. Liang Liu, Xiaopeng Luo (31 May 2020) [ODL]
- Complex Sequential Understanding through the Awareness of Spatial and Temporal Concepts. Bo Pang, Kaiwen Zha, Hanwen Cao, Jiajun Tang, Minghui Yu, Cewu Lu (30 May 2020)
- CoolMomentum: A Method for Stochastic Optimization by Langevin Dynamics with Simulated Annealing. O. Borysenko, M. Byshkin (29 May 2020) [ODL]
- HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism. Jay H. Park, Gyeongchan Yun, Chang Yi, N. T. Nguyen, Seungmin Lee, Jaesik Choi, S. Noh, Young-ri Choi (28 May 2020) [MoE]
- Convergence Analysis of Riemannian Stochastic Approximation Schemes. Alain Durmus, P. Jiménez, Eric Moulines, Salem Said, Hoi-To Wai (27 May 2020)
- Scalable Privacy-Preserving Distributed Learning. D. Froelicher, J. Troncoso-Pastoriza, Apostolos Pyrgelis, Sinem Sav, João Sá Sousa, Jean-Philippe Bossuat, Jean-Pierre Hubaux (19 May 2020) [FedML]
- PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking. Chong Xiang, A. Bhagoji, Vikash Sehwag, Prateek Mittal (17 May 2020) [AAML]
- S-ADDOPT: Decentralized stochastic first-order optimization over directed graphs. Muhammad I. Qureshi, Ran Xin, S. Kar, U. Khan (15 May 2020)
- Interpreting Rate-Distortion of Variational Autoencoder and Using Model Uncertainty for Anomaly Detection. Seonho Park, George Adosoglou, P. Pardalos (05 May 2020) [DRL, UQCV]
- Dynamic backup workers for parallel machine learning. Chuan Xu, Giovanni Neglia, Nicola Sebastianelli (30 Apr 2020)
- The Impact of the Mini-batch Size on the Variance of Gradients in Stochastic Gradient Descent. Xin-Yao Qian, Diego Klabjan (27 Apr 2020) [ODL]
- Heterogeneous CPU+GPU Stochastic Gradient Descent Algorithms. Yujing Ma, Florin Rusu (19 Apr 2020)
- On Learning Rates and Schrödinger Operators. Bin Shi, Weijie J. Su, Michael I. Jordan (15 Apr 2020)
- Stochastic batch size for adaptive regularization in deep network optimization. Kensuke Nakamura, Stefano Soatto, Byung-Woo Hong (14 Apr 2020) [ODL]
- Straggler-aware Distributed Learning: Communication Computation Latency Trade-off. Emre Ozfatura, S. Ulukus, Deniz Gunduz (10 Apr 2020)
- On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration. Wenlong Mou, C. J. Li, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan (09 Apr 2020)
- Deep Neural Network Learning with Second-Order Optimizers -- a Practical Study with a Stochastic Quasi-Gauss-Newton Method. C. Thiele, Mauricio Araya-Polo, D. Hohl (06 Apr 2020) [ODL]
- Stopping Criteria for, and Strong Convergence of, Stochastic Gradient Descent on Bottou-Curtis-Nocedal Functions. V. Patel (01 Apr 2020)
- Concentrated Differentially Private and Utility Preserving Federated Learning. Rui Hu, Yuanxiong Guo, Yanmin Gong (30 Mar 2020) [FedML]
- Differentially Private Federated Learning for Resource-Constrained Internet of Things. Rui Hu, Yuanxiong Guo, E. Ratazzi, Yanmin Gong (28 Mar 2020) [FedML]
- A Hybrid-Order Distributed SGD Method for Non-Convex Optimization to Balance Communication Overhead, Computational Complexity, and Convergence Rate. Naeimeh Omidvar, M. Maddah-ali, Hamed Mahdavi (27 Mar 2020) [ODL]
- Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence. Abhishek Gupta, W. Haskell (25 Mar 2020)
- Finite-Time Analysis of Stochastic Gradient Descent under Markov Randomness. Thinh T. Doan, Lam M. Nguyen, Nhan H. Pham, Justin Romberg (24 Mar 2020)
- A Unified Theory of Decentralized SGD with Changing Topology and Local Updates. Anastasia Koloskova, Nicolas Loizou, Sadra Boreiri, Martin Jaggi, Sebastian U. Stich (23 Mar 2020) [FedML]
- Block Layer Decomposition schemes for training Deep Neural Networks. L. Palagi, R. Seccia (18 Mar 2020)
- The Implicit Regularization of Stochastic Gradient Flow for Least Squares. Alnur Ali, Yan Sun, Robert Tibshirani (17 Mar 2020)
- Dynamic transformation of prior knowledge into Bayesian models for data streams. Tran Xuan Bach, N. Anh, Ngo Van Linh, Khoat Than (13 Mar 2020)
- Truncated Inference for Latent Variable Optimization Problems: Application to Robust Estimation and Learning. Christopher Zach, Huu Le (12 Mar 2020)
- Machine Learning on Volatile Instances. Xiaoxi Zhang, Jianyu Wang, Gauri Joshi, Carlee Joe-Wong (12 Mar 2020)
- On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings. Mahmoud Assran, Michael G. Rabbat (27 Feb 2020)
- Disentangling Adaptive Gradient Methods from Learning Rates. Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang (26 Feb 2020)
- PrIU: A Provenance-Based Approach for Incrementally Updating Regression Models. Yinjun Wu, V. Tannen, S. Davidson (26 Feb 2020)
- LASG: Lazily Aggregated Stochastic Gradients for Communication-Efficient Distributed Learning. Tianyi Chen, Yuejiao Sun, W. Yin (26 Feb 2020) [FedML]
- Device Heterogeneity in Federated Learning: A Superquantile Approach. Yassine Laguel, Krishna Pillutla, J. Malick, Zaïd Harchaoui (25 Feb 2020) [FedML]
- Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs. Lei Huang, Jie Qin, Li Liu, Fan Zhu, Ling Shao (25 Feb 2020) [AI4CE]
- Can speed up the convergence rate of stochastic gradient methods to $\mathcal{O}(1/k^2)$ by a gradient averaging strategy? Xin Xu, Xiaopeng Luo (25 Feb 2020)
- Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent. Bao Wang, T. Nguyen, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher (24 Feb 2020) [ODL]