ResearchTrend.AI
Optimization Methods for Large-Scale Machine Learning

15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,407 papers shown
Convergence of stochastic gradient descent under a local Lojasiewicz condition for deep neural networks
Jing An
Jianfeng Lu
19
4
0
18 Apr 2023
Fast Neural Scene Flow
Xueqian Li
Jianqiao Zheng
Francesco Ferroni
J. K. Pontes
Simon Lucey
29
27
0
18 Apr 2023
Fast and Straggler-Tolerant Distributed SGD with Reduced Computation Load
Maximilian Egger
Serge Kas Hanna
Rawad Bitar
FedML
27
0
0
17 Apr 2023
Communication and Energy Efficient Wireless Federated Learning with Intrinsic Privacy
Zhenxiao Zhang
Yuanxiong Guo
Yuguang Fang
Yanmin Gong
36
4
0
15 Apr 2023
Statistical Analysis of Fixed Mini-Batch Gradient Descent Estimator
Haobo Qi
Feifei Wang
Hansheng Wang
30
13
0
13 Apr 2023
Automatic Gradient Descent: Deep Learning without Hyperparameters
Jeremy Bernstein
Chris Mingard
Kevin Huang
Navid Azizan
Yisong Yue
ODL
16
17
0
11 Apr 2023
Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein Space
Michael Diao
Krishnakumar Balasubramanian
Sinho Chewi
Adil Salim
BDL
32
21
0
10 Apr 2023
High-dimensional scaling limits and fluctuations of online least-squares SGD with smooth covariance
Krishnakumar Balasubramanian
Promit Ghosal
Ye He
38
5
0
03 Apr 2023
Fast Convergence of Random Reshuffling under Over-Parameterization and the Polyak-Łojasiewicz Condition
Chen Fan
Christos Thrampoulidis
Mark W. Schmidt
33
2
0
02 Apr 2023
Doubly Stochastic Models: Learning with Unbiased Label Noises and Inference Stability
Haoyi Xiong
Xuhong Li
Bo Yu
Zhanxing Zhu
Dongrui Wu
Dejing Dou
NoLa
14
0
0
01 Apr 2023
Unified analysis of SGD-type methods
Eduard A. Gorbunov
30
2
0
29 Mar 2023
FedAgg: Adaptive Federated Learning with Aggregated Gradients
Wenhao Yuan
Xuehe Wang
FedML
48
0
0
28 Mar 2023
Forget-free Continual Learning with Soft-Winning SubNetworks
Haeyong Kang
Jaehong Yoon
Sultan Rizky Hikmawan Madjid
Sung Ju Hwang
Chang D. Yoo
CLL
36
4
0
27 Mar 2023
Interacting Particle Langevin Algorithm for Maximum Marginal Likelihood Estimation
Ö. Deniz Akyildiz
F. R. Crucinio
Mark Girolami
Tim Johnston
Sotirios Sabanis
32
12
0
23 Mar 2023
Make Landscape Flatter in Differentially Private Federated Learning
Yi Shi
Yingqi Liu
Kang Wei
Li Shen
Xueqian Wang
Dacheng Tao
FedML
25
54
0
20 Mar 2023
Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference
Kyurae Kim
Kaiwen Wu
Jisu Oh
Jacob R. Gardner
BDL
31
7
0
18 Mar 2023
On the Utility of Equal Batch Sizes for Inference in Stochastic Gradient Descent
Rahul Singh
A. Shukla
Dootika Vats
32
0
0
14 Mar 2023
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
Jaeyoung Cha
Jaewook Lee
Chulhee Yun
28
23
0
13 Mar 2023
Boosting Distributed Full-graph GNN Training with Asynchronous One-bit Communication
Mengdie Zhang
Qi Hu
Peng Sun
Yonggang Wen
Tianwei Zhang
GNN
40
5
0
02 Mar 2023
Multi-task neural networks by learned contextual inputs
Anders T. Sandnes
B. Grimstad
O. Kolbjørnsen
14
1
0
01 Mar 2023
Dimension-reduced KRnet maps for high-dimensional Bayesian inverse problems
Yani Feng
Keju Tang
Xiaoliang Wan
Qifeng Liao
19
2
0
01 Mar 2023
D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory
Tianbo Li
Min Lin
Zheyuan Hu
Kunhao Zheng
G. Vignale
Kenji Kawaguchi
A. Neto
K. Novoselov
Shuicheng Yan
162
9
0
01 Mar 2023
Maximum Likelihood With a Time Varying Parameter
Alberto Lanconelli
Christopher S. A. Lauria
16
3
0
28 Feb 2023
Stochastic Gradient Descent under Markovian Sampling Schemes
Mathieu Even
19
28
0
28 Feb 2023
How optimal transport can tackle gender biases in multi-class neural-network classifiers for job recommendations?
Fanny Jourdan
Titon Tshiongo Kaninku
Nicholas M. Asher
Jean-Michel Loubes
Laurent Risser
FaML
26
4
0
27 Feb 2023
Scalable Neural Network Training over Distributed Graphs
Aashish Kolluri
Sarthak Choudhary
Bryan Hooi
Prateek Saxena
GNN
21
0
0
25 Feb 2023
Statistical Inference with Stochastic Gradient Methods under $φ$-mixing Data
Ruiqi Liu
Xinyu Chen
Zuofeng Shang
FedML
19
6
0
24 Feb 2023
Why Target Networks Stabilise Temporal Difference Methods
Matt Fellows
Matthew Smith
Shimon Whiteson
OOD
AAML
21
7
0
24 Feb 2023
Advancements in Federated Learning: Models, Methods, and Privacy
Hui Chen
Huandong Wang
Qingyue Long
Depeng Jin
Yong Li
FedML
44
14
0
22 Feb 2023
Stochastic Approximation Beyond Gradient for Signal Processing and Machine Learning
Aymeric Dieuleveut
G. Fort
Eric Moulines
Hoi-To Wai
59
12
0
22 Feb 2023
WW-FL: Secure and Private Large-Scale Federated Learning
F. Marx
T. Schneider
Ajith Suresh
Tobias Wehrle
Christian Weinert
Hossein Yalame
FedML
25
2
0
20 Feb 2023
Statistical Inference for Linear Functionals of Online SGD in High-dimensional Linear Regression
Bhavya Agrawalla
Krishnakumar Balasubramanian
Promit Ghosal
25
2
0
20 Feb 2023
MaxGNR: A Dynamic Weight Strategy via Maximizing Gradient-to-Noise Ratio for Multi-Task Learning
Caoyun Fan
Wenqing Chen
Jidong Tian
Yitian Li
Hao He
Yaohui Jin
19
2
0
18 Feb 2023
SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance
Amit Attia
Tomer Koren
ODL
22
25
0
17 Feb 2023
On the convergence result of the gradient-push algorithm on directed graphs with constant stepsize
Woocheol Choi
Doheon Kim
S. Yun
29
1
0
17 Feb 2023
A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk Minimization
Mathieu Dagréou
Thomas Moreau
Samuel Vaiter
Pierre Ablin
39
12
0
17 Feb 2023
Statistically Optimal Force Aggregation for Coarse-Graining Molecular Dynamics
Andreas Krämer
Aleksander E. P. Durumeric
N. Charron
Yaoyi Chen
C. Clementi
Frank Noé
AI4CE
30
20
0
14 Feb 2023
Beyond Uniform Smoothness: A Stopped Analysis of Adaptive SGD
Matthew Faw
Litu Rout
C. Caramanis
Sanjay Shakkottai
21
37
0
13 Feb 2023
Near-Optimal Non-Convex Stochastic Optimization under Generalized Smoothness
Zijian Liu
Srikanth Jagabathula
Zhengyuan Zhou
24
5
0
13 Feb 2023
Cyclic and Randomized Stepsizes Invoke Heavier Tails in SGD than Constant Stepsize
Mert Gurbuzbalaban
Yuanhan Hu
Umut Simsekli
Lingjiong Zhu
LRM
23
1
0
10 Feb 2023
On the Convergence of Stochastic Gradient Descent for Linear Inverse Problems in Banach Spaces
Ž. Kereta
Bangti Jin
26
6
0
10 Feb 2023
On the Privacy-Robustness-Utility Trilemma in Distributed Learning
Youssef Allouah
R. Guerraoui
Nirupam Gupta
Rafael Pinot
John Stephan
FedML
26
21
0
09 Feb 2023
Extragradient-Type Methods with $\mathcal{O}(1/k)$ Last-Iterate Convergence Rates for Co-Hypomonotone Inclusions
Quoc Tran-Dinh
31
2
0
08 Feb 2023
Improving the Model Consistency of Decentralized Federated Learning
Yi Shi
Li Shen
Kang Wei
Yan Sun
Bo Yuan
Xueqian Wang
Dacheng Tao
FedML
36
51
0
08 Feb 2023
Target-based Surrogates for Stochastic Optimization
J. Lavington
Sharan Vaswani
Reza Babanezhad
Mark W. Schmidt
Nicolas Le Roux
55
5
0
06 Feb 2023
Fixing by Mixing: A Recipe for Optimal Byzantine ML under Heterogeneity
Youssef Allouah
Sadegh Farhadkhani
R. Guerraoui
Nirupam Gupta
Rafael Pinot
John Stephan
FedML
45
49
0
03 Feb 2023
Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective
Chenyu You
Weicheng Dai
Yifei Min
Fenglin Liu
David A. Clifton
S. Kevin Zhou
Lawrence H. Staib
James S Duncan
23
69
0
03 Feb 2023
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
31
47
0
02 Feb 2023
Deep networks for system identification: a Survey
G. Pillonetto
Aleksandr Aravkin
Daniel Gedon
L. Ljung
Antônio H. Ribeiro
Thomas B. Schon
OOD
37
36
0
30 Jan 2023
Distributed Stochastic Optimization under a General Variance Condition
Kun-Yen Huang
Xiao Li
Shin-Yi Pu
FedML
43
6
0
30 Jan 2023