1606.04838
Cited By
v1
v2
v3 (latest)
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing "Optimization Methods for Large-Scale Machine Learning"
50 / 866 papers shown
The Marginal Value of Momentum for Small Learning Rate SGD
Runzhe Wang
Sadhika Malladi
Tianhao Wang
Kaifeng Lyu
Zhiyuan Li
ODL
86
9
0
27 Jul 2023
Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Meixuan He
Yuqing Liang
Jinlan Liu
Dongpo Xu
85
9
0
20 Jul 2023
Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality
Ziyang Wei
Wanrong Zhu
Wei Biao Wu
130
5
0
13 Jul 2023
Empirical Risk Minimization with Shuffled SGD: A Primal-Dual Perspective and Improved Bounds
Xu Cai
Cheuk Yin Lin
Jelena Diakonikolas
FedML
74
5
0
21 Jun 2023
Bootstrapped Representations in Reinforcement Learning
Charline Le Lan
Stephen Tu
Mark Rowland
Anna Harutyunyan
Rishabh Agarwal
Marc G. Bellemare
Will Dabney
OffRL
OOD
SSL
138
10
0
16 Jun 2023
Schema-learning and rebinding as mechanisms of in-context learning and emergence
Siva K. Swaminathan
Antoine Dedieu
Rajkumar Vasudeva Raju
Murray Shanahan
Miguel Lazaro-Gredilla
Dileep George
99
14
0
16 Jun 2023
Robustly Learning a Single Neuron via Sharpness
Puqian Wang
Nikos Zarifis
Ilias Diakonikolas
Jelena Diakonikolas
67
9
0
13 Jun 2023
Decentralized SGD and Average-direction SAM are Asymptotically Equivalent
Tongtian Zhu
Fengxiang He
Kaixuan Chen
Mingli Song
Dacheng Tao
156
15
0
05 Jun 2023
Incentivizing Honesty among Competitors in Collaborative Learning and Optimization
Florian E. Dorner
Nikola Konstantinov
Georgi Pashaliev
Martin Vechev
FedML
144
7
0
25 May 2023
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods
Junchi Yang
Xiang Li
Ilyas Fatkhullin
Niao He
92
17
0
21 May 2023
Online Learning Under A Separable Stochastic Approximation Framework
Min Gan
Xiang-Xiang Su
Guang-yong Chen
Jing Chen
66
0
0
12 May 2023
Over-the-Air Federated Averaging with Limited Power and Privacy Budgets
Na Yan
Kezhi Wang
Cunhua Pan
K. K. Chai
Feng Shu
Jiangzhou Wang
FedML
55
2
0
05 May 2023
Multilevel Monte Carlo estimators for derivative-free optimization under uncertainty
F. Menhorn
Gianluca Geraci
D. Seidl
Youssef M. Marzouk
M. Eldred
H. Bungartz
50
1
0
04 May 2023
A Cluster-Based Opposition Differential Evolution Algorithm Boosted by a Local Search for ECG Signal Classification
Mehran Pourvahab
Seyed Jalaleddin Mousavirad
Virginie Felizardo
Pedro Gusmão
Henriques Zacarias
Hamzeh Mohammadigheymasi
Nicholas D. Lane
Seyed Nooreddin Jafari
Nuno M. Garcia
56
3
0
04 May 2023
When Deep Learning Meets Polyhedral Theory: A Survey
Joey Huchette
Gonzalo Muñoz
Thiago Serra
Calvin Tsay
AI4CE
160
37
0
29 Apr 2023
An Adaptive Policy to Employ Sharpness-Aware Minimization
Weisen Jiang
Hansi Yang
Yu Zhang
James T. Kwok
AAML
130
34
0
28 Apr 2023
Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein Space
Michael Diao
Krishnakumar Balasubramanian
Sinho Chewi
Adil Salim
BDL
68
29
0
10 Apr 2023
High-dimensional scaling limits and fluctuations of online least-squares SGD with smooth covariance
Krishnakumar Balasubramanian
Promit Ghosal
Ye He
107
5
0
03 Apr 2023
FedAgg: Adaptive Federated Learning with Aggregated Gradients
Wenhao Yuan
Xuehe Wang
FedML
145
1
0
28 Mar 2023
Forget-free Continual Learning with Soft-Winning SubNetworks
Haeyong Kang
Jaehong Yoon
Sultan Rizky Hikmawan Madjid
Sung Ju Hwang
Chang D. Yoo
CLL
105
4
0
27 Mar 2023
Interacting Particle Langevin Algorithm for Maximum Marginal Likelihood Estimation
Ö. Deniz Akyildiz
F. R. Crucinio
Mark Girolami
Tim Johnston
Sotirios Sabanis
138
13
0
23 Mar 2023
On the Utility of Equal Batch Sizes for Inference in Stochastic Gradient Descent
Rahul Singh
A. Shukla
Dootika Vats
56
0
0
14 Mar 2023
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
Jaeyoung Cha
Jaewook Lee
Chulhee Yun
87
24
0
13 Mar 2023
Multi-task neural networks by learned contextual inputs
Anders T. Sandnes
B. Grimstad
O. Kolbjørnsen
51
2
0
01 Mar 2023
Maximum Likelihood With a Time Varying Parameter
Alberto Lanconelli
Christopher S. A. Lauria
52
4
0
28 Feb 2023
Statistical Inference with Stochastic Gradient Methods under φ-mixing Data
Ruiqi Liu
Xinyu Chen
Zuofeng Shang
FedML
84
6
0
24 Feb 2023
WW-FL: Secure and Private Large-Scale Federated Learning
F. Marx
T. Schneider
Ajith Suresh
Tobias Wehrle
Christian Weinert
Hossein Yalame
FedML
66
2
0
20 Feb 2023
Statistical Inference for Linear Functionals of Online SGD in High-dimensional Linear Regression
Bhavya Agrawalla
Krishnakumar Balasubramanian
Promit Ghosal
109
2
0
20 Feb 2023
On the convergence result of the gradient-push algorithm on directed graphs with constant stepsize
Woocheol Choi
Doheon Kim
S. Yun
67
1
0
17 Feb 2023
Extragradient-Type Methods with $\mathcal{O}(1/k)$ Last-Iterate Convergence Rates for Co-Hypomonotone Inclusions
Quoc Tran-Dinh
81
2
0
08 Feb 2023
Improving the Model Consistency of Decentralized Federated Learning
Yi Shi
Li Shen
Kang Wei
Yan Sun
Bo Yuan
Xueqian Wang
Dacheng Tao
FedML
114
52
0
08 Feb 2023
Target-based Surrogates for Stochastic Optimization
J. Lavington
Sharan Vaswani
Reza Babanezhad
Mark Schmidt
Nicolas Le Roux
99
6
0
06 Feb 2023
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
130
49
0
02 Feb 2023
MLPGradientFlow: going with the flow of multilayer perceptrons (and finding minima fast and accurately)
Johanni Brea
Flavio Martinelli
Berfin Simsek
W. Gerstner
35
4
0
25 Jan 2023
A Stochastic Proximal Polyak Step Size
Fabian Schaipp
Robert Mansel Gower
M. Ulbrich
64
12
0
12 Jan 2023
Federated Learning under Heterogeneous and Correlated Client Availability
Angelo Rodio
Francescomaria Faticanti
Othmane Marfoq
Giovanni Neglia
Emilio Leonardi
FedML
84
27
0
11 Jan 2023
Sharper Analysis for Minibatch Stochastic Proximal Point Methods: Stability, Smoothness, and Deviation
Xiao-Tong Yuan
P. Li
89
2
0
09 Jan 2023
Randomized Block-Coordinate Optimistic Gradient Algorithms for Root-Finding Problems
Quoc Tran-Dinh
Yang Luo
194
8
0
08 Jan 2023
Federated Learning for Data Streams
Othmane Marfoq
Giovanni Neglia
Laetitia Kameni
Richard Vidal
FedML
82
12
0
04 Jan 2023
Variance Reduction for Score Functions Using Optimal Baselines
Ronan L. Keane
H. Gao
52
0
0
27 Dec 2022
Gradient Descent-Type Methods: Background and Simple Unified Convergence Analysis
Quoc Tran-Dinh
Marten van Dijk
58
0
0
19 Dec 2022
Convergence Analysis for Training Stochastic Neural Networks via Stochastic Gradient Descent
Richard Archibald
F. Bao
Yanzhao Cao
Hui‐Jie Sun
99
2
0
17 Dec 2022
Scheduling and Aggregation Design for Asynchronous Federated Learning over Wireless Networks
Chung-Hsuan Hu
Zheng Chen
Erik G. Larsson
86
73
0
14 Dec 2022
Learning useful representations for shifting tasks and distributions
Jianyu Zhang
Léon Bottou
OOD
76
14
0
14 Dec 2022
Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points
Mayank Baranwal
Param Budhraja
V. Raj
A. Hota
67
3
0
07 Dec 2022
Distributed Stochastic Gradient Descent with Cost-Sensitive and Strategic Agents
Abdullah Basar Akbay
C. Tepedelenlioğlu
FedML
53
0
0
05 Dec 2022
Convergence of ease-controlled Random Reshuffling gradient Algorithms under Lipschitz smoothness
R. Seccia
Corrado Coppola
G. Liuzzi
L. Palagi
76
2
0
04 Dec 2022
Learning-Assisted Algorithm Unrolling for Online Optimization with Budget Constraints
Jianyi Yang
Shaolei Ren
85
2
0
03 Dec 2022
GlueFL: Reconciling Client Sampling and Model Masking for Bandwidth Efficient Federated Learning
Shiqi He
Qifan Yan
Feijie Wu
Lanjun Wang
Mathias Lécuyer
Ivan Beschastnikh
FedML
82
8
0
03 Dec 2022
Impact of Redundancy on Resilience in Distributed Optimization and Learning
Shuo Liu
Nirupam Gupta
Nitin H. Vaidya
107
2
0
16 Nov 2022