Optimization Methods for Large-Scale Machine Learning

15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 866 papers shown
The Marginal Value of Momentum for Small Learning Rate SGD
Runzhe Wang
Sadhika Malladi
Tianhao Wang
Kaifeng Lyu
Zhiyuan Li
ODL
86
9
0
27 Jul 2023
Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Meixuan He
Yuqing Liang
Jinlan Liu
Dongpo Xu
85
9
0
20 Jul 2023
Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality
Ziyang Wei
Wanrong Zhu
Wei Biao Wu
130
5
0
13 Jul 2023
Empirical Risk Minimization with Shuffled SGD: A Primal-Dual Perspective and Improved Bounds
Xu Cai
Cheuk Yin Lin
Jelena Diakonikolas
FedML
74
5
0
21 Jun 2023
Bootstrapped Representations in Reinforcement Learning
Charline Le Lan
Stephen Tu
Mark Rowland
Anna Harutyunyan
Rishabh Agarwal
Marc G. Bellemare
Will Dabney
OffRL OOD SSL
138
10
0
16 Jun 2023
Schema-learning and rebinding as mechanisms of in-context learning and emergence
Siva K. Swaminathan
Antoine Dedieu
Rajkumar Vasudeva Raju
Murray Shanahan
Miguel Lazaro-Gredilla
Dileep George
99
14
0
16 Jun 2023
Robustly Learning a Single Neuron via Sharpness
Puqian Wang
Nikos Zarifis
Ilias Diakonikolas
Jelena Diakonikolas
67
9
0
13 Jun 2023
Decentralized SGD and Average-direction SAM are Asymptotically Equivalent
Tongtian Zhu
Fengxiang He
Kaixuan Chen
Mingli Song
Dacheng Tao
156
15
0
05 Jun 2023
Incentivizing Honesty among Competitors in Collaborative Learning and Optimization
Florian E. Dorner
Nikola Konstantinov
Georgi Pashaliev
Martin Vechev
FedML
142
7
0
25 May 2023
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods
Junchi Yang
Xiang Li
Ilyas Fatkhullin
Niao He
92
17
0
21 May 2023
Online Learning Under A Separable Stochastic Approximation Framework
Min Gan
Xiang-Xiang Su
Guang-yong Chen
Jing Chen
66
0
0
12 May 2023
Over-the-Air Federated Averaging with Limited Power and Privacy Budgets
Na Yan
Kezhi Wang
Cunhua Pan
K. K. Chai
Feng Shu
Jiangzhou Wang
FedML
55
2
0
05 May 2023
Multilevel Monte Carlo estimators for derivative-free optimization under uncertainty
F. Menhorn
Gianluca Geraci
D. Seidl
Youssef M. Marzouk
M. Eldred
H. Bungartz
50
1
0
04 May 2023
A Cluster-Based Opposition Differential Evolution Algorithm Boosted by a Local Search for ECG Signal Classification
Mehran Pourvahab
Seyed Jalaleddin Mousavirad
Virginie Felizardo
Pedro Gusmão
Henriques Zacarias
Hamzeh Mohammadigheymasi
Nicholas D. Lane
Seyed Nooreddin Jafari
Nuno M. Garcia
56
3
0
04 May 2023
When Deep Learning Meets Polyhedral Theory: A Survey
Joey Huchette
Gonzalo Muñoz
Thiago Serra
Calvin Tsay
AI4CE
160
37
0
29 Apr 2023
An Adaptive Policy to Employ Sharpness-Aware Minimization
Weisen Jiang
Hansi Yang
Yu Zhang
James T. Kwok
AAML
130
34
0
28 Apr 2023
Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein Space
Michael Diao
Krishnakumar Balasubramanian
Sinho Chewi
Adil Salim
BDL
68
29
0
10 Apr 2023
High-dimensional scaling limits and fluctuations of online least-squares SGD with smooth covariance
Krishnakumar Balasubramanian
Promit Ghosal
Ye He
107
5
0
03 Apr 2023
FedAgg: Adaptive Federated Learning with Aggregated Gradients
Wenhao Yuan
Xuehe Wang
FedML
145
1
0
28 Mar 2023
Forget-free Continual Learning with Soft-Winning SubNetworks
Haeyong Kang
Jaehong Yoon
Sultan Rizky Hikmawan Madjid
Sung Ju Hwang
Chang D. Yoo
CLL
105
4
0
27 Mar 2023
Interacting Particle Langevin Algorithm for Maximum Marginal Likelihood Estimation
Ö. Deniz Akyildiz
F. R. Crucinio
Mark Girolami
Tim Johnston
Sotirios Sabanis
138
13
0
23 Mar 2023
On the Utility of Equal Batch Sizes for Inference in Stochastic Gradient Descent
Rahul Singh
A. Shukla
Dootika Vats
56
0
0
14 Mar 2023
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
Jaeyoung Cha
Jaewook Lee
Chulhee Yun
87
24
0
13 Mar 2023
Multi-task neural networks by learned contextual inputs
Anders T. Sandnes
B. Grimstad
O. Kolbjørnsen
51
2
0
01 Mar 2023
Maximum Likelihood With a Time Varying Parameter
Alberto Lanconelli
Christopher S. A. Lauria
52
4
0
28 Feb 2023
Statistical Inference with Stochastic Gradient Methods under $φ$-mixing Data
Ruiqi Liu
Xinyu Chen
Zuofeng Shang
FedML
84
6
0
24 Feb 2023
WW-FL: Secure and Private Large-Scale Federated Learning
F. Marx
T. Schneider
Ajith Suresh
Tobias Wehrle
Christian Weinert
Hossein Yalame
FedML
66
2
0
20 Feb 2023
Statistical Inference for Linear Functionals of Online SGD in High-dimensional Linear Regression
Bhavya Agrawalla
Krishnakumar Balasubramanian
Promit Ghosal
109
2
0
20 Feb 2023
On the convergence result of the gradient-push algorithm on directed graphs with constant stepsize
Woocheol Choi
Doheon Kim
S. Yun
67
1
0
17 Feb 2023
Extragradient-Type Methods with $\mathcal{O}(1/k)$ Last-Iterate Convergence Rates for Co-Hypomonotone Inclusions
Quoc Tran-Dinh
81
2
0
08 Feb 2023
Improving the Model Consistency of Decentralized Federated Learning
Yi Shi
Li Shen
Kang Wei
Yan Sun
Bo Yuan
Xueqian Wang
Dacheng Tao
FedML
114
52
0
08 Feb 2023
Target-based Surrogates for Stochastic Optimization
J. Lavington
Sharan Vaswani
Reza Babanezhad
Mark Schmidt
Nicolas Le Roux
99
6
0
06 Feb 2023
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
130
49
0
02 Feb 2023
MLPGradientFlow: going with the flow of multilayer perceptrons (and finding minima fast and accurately)
Johanni Brea
Flavio Martinelli
Berfin Simsek
W. Gerstner
35
4
0
25 Jan 2023
A Stochastic Proximal Polyak Step Size
Fabian Schaipp
Robert Mansel Gower
M. Ulbrich
64
12
0
12 Jan 2023
Federated Learning under Heterogeneous and Correlated Client Availability
Angelo Rodio
Francescomaria Faticanti
Othmane Marfoq
Giovanni Neglia
Emilio Leonardi
FedML
84
27
0
11 Jan 2023
Sharper Analysis for Minibatch Stochastic Proximal Point Methods: Stability, Smoothness, and Deviation
Xiao-Tong Yuan
P. Li
89
2
0
09 Jan 2023
Randomized Block-Coordinate Optimistic Gradient Algorithms for Root-Finding Problems
Quoc Tran-Dinh
Yang Luo
194
8
0
08 Jan 2023
Federated Learning for Data Streams
Othmane Marfoq
Giovanni Neglia
Laetitia Kameni
Richard Vidal
FedML
82
12
0
04 Jan 2023
Variance Reduction for Score Functions Using Optimal Baselines
Ronan L. Keane
H. Gao
52
0
0
27 Dec 2022
Gradient Descent-Type Methods: Background and Simple Unified Convergence Analysis
Quoc Tran-Dinh
Marten van Dijk
58
0
0
19 Dec 2022
Convergence Analysis for Training Stochastic Neural Networks via Stochastic Gradient Descent
Richard Archibald
F. Bao
Yanzhao Cao
Hui‐Jie Sun
99
2
0
17 Dec 2022
Scheduling and Aggregation Design for Asynchronous Federated Learning over Wireless Networks
Chung-Hsuan Hu
Zheng Chen
Erik G. Larsson
86
73
0
14 Dec 2022
Learning useful representations for shifting tasks and distributions
Jianyu Zhang
Léon Bottou
OOD
76
14
0
14 Dec 2022
Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points
Mayank Baranwal
Param Budhraja
V. Raj
A. Hota
67
3
0
07 Dec 2022
Distributed Stochastic Gradient Descent with Cost-Sensitive and Strategic Agents
Abdullah Basar Akbay
C. Tepedelenlioğlu
FedML
53
0
0
05 Dec 2022
Convergence of ease-controlled Random Reshuffling gradient Algorithms under Lipschitz smoothness
R. Seccia
Corrado Coppola
G. Liuzzi
L. Palagi
76
2
0
04 Dec 2022
Learning-Assisted Algorithm Unrolling for Online Optimization with Budget Constraints
Jianyi Yang
Shaolei Ren
85
2
0
03 Dec 2022
GlueFL: Reconciling Client Sampling and Model Masking for Bandwidth Efficient Federated Learning
Shiqi He
Qifan Yan
Feijie Wu
Lanjun Wang
Mathias Lécuyer
Ivan Beschastnikh
FedML
82
8
0
03 Dec 2022
Impact of Redundancy on Resilience in Distributed Optimization and Learning
Shuo Liu
Nirupam Gupta
Nitin H. Vaidya
107
2
0
16 Nov 2022