Optimization Methods for Large-Scale Machine Learning
Léon Bottou, Frank E. Curtis, J. Nocedal
arXiv:1606.04838 (v3, latest) · 15 June 2016
Papers citing "Optimization Methods for Large-Scale Machine Learning" (50 of 866 papers shown)
Markov Chain Score Ascent: A Unifying Framework of Variational Inference with Markovian Gradients
Kyurae Kim, Jisu Oh, Jacob R. Gardner, Adji Bousso Dieng, Hongseok Kim · BDL · 13 Jun 2022

On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen, Trang H. Tran · 13 Jun 2022

A Unified Convergence Theorem for Stochastic Optimization Methods
Xiao Li, Andre Milzarek · 08 Jun 2022

Interference Management for Over-the-Air Federated Learning in Multi-Cell Wireless Networks
Zhibin Wang, Yong Zhou, Yuanming Shi, W. Zhuang · 06 Jun 2022

A Control Theoretic Framework for Adaptive Gradient Optimizers in Machine Learning
Kushal Chakrabarti, Nikhil Chopra · ODL, AI4CE · 04 Jun 2022

Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees
Jue Wang, Binhang Yuan, Luka Rimanic, Yongjun He, Tri Dao, Beidi Chen, Christopher Ré, Ce Zhang · AI4CE · 02 Jun 2022

Computing the Variance of Shuffling Stochastic Gradient Algorithms via Power Spectral Density Analysis
Carles Domingo-Enrich · 01 Jun 2022

Online Deep Equilibrium Learning for Regularization by Denoising
Jiaming Liu, Xiaojian Xu, Weijie Gan, Shirin Shoushtari, Ulugbek S. Kamilov · 25 May 2022

Fast Stochastic Composite Minimization and an Accelerated Frank-Wolfe Algorithm under Parallelization
Benjamin Dubois-Taine, Francis R. Bach, Quentin Berthet, Adrien B. Taylor · 25 May 2022

Learning from time-dependent streaming data with online stochastic algorithms
Antoine Godichon-Baggioni, Nicklas Werge, Olivier Wintenberger · 25 May 2022

Incorporating Prior Knowledge into Neural Networks through an Implicit Composite Kernel
Ziyang Jiang, Tongshu Zheng, Yiling Liu, David Carlson · 15 May 2022

Decentralized Stochastic Optimization with Inherent Privacy Protection
Yongqiang Wang, H. Vincent Poor · 08 May 2022

LAWS: Look Around and Warm-Start Natural Gradient Descent for Quantum Neural Networks
Zeyi Tao, Jindi Wu, Qi Xia, Qun Li · 05 May 2022

Byzantine Fault Tolerance in Distributed Machine Learning: a Survey
Djamila Bouhata, Hamouma Moumen, Moumen Hamouma, Ahcène Bounceur · AI4CE · 05 May 2022

FedShuffle: Recipes for Better Use of Local Work in Federated Learning
Samuel Horváth, Maziar Sanjabi, Lin Xiao, Peter Richtárik, Michael G. Rabbat · FedML · 27 Apr 2022

Hessian Averaging in Stochastic Newton Methods Achieves Superlinear Convergence
Sen Na, Michal Derezinski, Michael W. Mahoney · 20 Apr 2022

FedCau: A Proactive Stop Policy for Communication and Computation Efficient Federated Learning
Afsaneh Mahmoudi, H. S. Ghadikolaei, José Hélio da Cruz Júnior, Carlo Fischione · 16 Apr 2022

Minimizing Control for Credit Assignment with Strong Feedback
Alexander Meulemans, Matilde Tristany Farinha, Maria R. Cervera, João Sacramento, Benjamin Grewe · 14 Apr 2022

Rethinking Exponential Averaging of the Fisher
C. Puiu · 10 Apr 2022

Distributed Evolution Strategies for Black-box Stochastic Optimization
Xiaoyu He, Zibin Zheng, Chuan Chen, Yuren Zhou, Chuan Luo, Qingwei Lin · 09 Apr 2022

Federated Learning with Partial Model Personalization
Krishna Pillutla, Kshitiz Malik, Abdel-rahman Mohamed, Michael G. Rabbat, Maziar Sanjabi, Lin Xiao · FedML · 08 Apr 2022

Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency
Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Yi Tian Xu, Xiang Wang, Mingqian Tang, Changxin Gao, Rong Jin, Nong Sang · SSL, AI4TS · 06 Apr 2022

Nonlinear gradient mappings and stochastic optimization: A general framework with applications to heavy-tail noise
D. Jakovetić, Dragana Bajović, Anit Kumar Sahu, S. Kar, Nemanja Milošević, Dusan Stamenkovic · 06 Apr 2022

Local Stochastic Factored Gradient Descent for Distributed Quantum State Tomography
Junhyung Lyle Kim, Taha Toghani, César A. Uribe, Anastasios Kyrillidis · 22 Mar 2022

Minimum Variance Unbiased N:M Sparsity for the Neural Gradients
Brian Chmiel, Itay Hubara, Ron Banner, Daniel Soudry · 21 Mar 2022

A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima
Tae-Eon Ko, Xiantao Li · 21 Mar 2022

Convergence rates of the stochastic alternating algorithm for bi-objective optimization
Suyun Liu, Luis Nunes Vicente · 20 Mar 2022

On the Properties of Adversarially-Trained CNNs
Mattia Carletti, M. Terzi, Gian Antonio Susto · AAML · 17 Mar 2022

Federated Minimax Optimization: Improved Convergence Analyses and Algorithms
Pranay Sharma, Rohan Panda, Gauri Joshi, P. Varshney · FedML · 09 Mar 2022

On the Optimization Landscape of Neural Collapse under MSE Loss: Global Optimality with Unconstrained Features
Jinxin Zhou, Xiao Li, Tian Ding, Chong You, Qing Qu, Zhihui Zhu · 02 Mar 2022

Asynchronous Fully-Decentralized SGD in the Cluster-Based Model
Hagit Attiya, N. Schiller · FedML · 22 Feb 2022

MSTGD: A Memory Stochastic sTratified Gradient Descent Method with an Exponential Convergence Rate
Aixiang Chen, Chen, Jinting Zhang, Zanbo Zhang, Zhihong Li · 21 Feb 2022

Tackling benign nonconvexity with smoothing and stochastic gradients
Harsh Vardhan, Sebastian U. Stich · 18 Feb 2022

Temporal Difference Learning with Continuous Time and State in the Stochastic Setting
Ziad Kobeissi, Francis R. Bach · OffRL · 16 Feb 2022

Federated Learning with Sparsified Model Perturbation: Improving Accuracy under Client-Level Differential Privacy
Rui Hu, Yanmin Gong, Yuanxiong Guo · FedML · 15 Feb 2022

The Power of Adaptivity in SGD: Self-Tuning Step Sizes with Unbounded Gradients and Affine Variance
Matthew Faw, Isidoros Tziotis, Constantine Caramanis, Aryan Mokhtari, Sanjay Shakkottai, Rachel A. Ward · 11 Feb 2022

Sharper Rates for Separable Minimax and Finite Sum Optimization via Primal-Dual Extragradient Methods
Yujia Jin, Aaron Sidford, Kevin Tian · 09 Feb 2022

On Almost Sure Convergence Rates of Stochastic Gradient Methods
Jun Liu, Ye Yuan · 09 Feb 2022

Characterizing & Finding Good Data Orderings for Fast Convergence of Sequential Gradient Methods
Amirkeivan Mohtashami, Sebastian U. Stich, Martin Jaggi · 03 Feb 2022

When Do Flat Minima Optimizers Work?
Jean Kaddour, Linqing Liu, Ricardo M. A. Silva, Matt J. Kusner · ODL · 01 Feb 2022

L-SVRG and L-Katyusha with Adaptive Sampling
Boxin Zhao, Boxiang Lyu, Mladen Kolar · 31 Jan 2022

A subsampling approach for Bayesian model selection
Jon Lachmann, G. Storvik, F. Frommlet, Aliaksadr Hubin · BDL · 31 Jan 2022

Communication-Efficient Consensus Mechanism for Federated Reinforcement Learning
Xing Xu, Rongpeng Li, Zhifeng Zhao, Honggang Zhang · FedML · 30 Jan 2022

Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization
G. Luca, E. Silverstein · 26 Jan 2022

Uphill Roads to Variational Tightness: Monotonicity and Monte Carlo Objectives
Pierre-Alexandre Mattei, J. Frellsen · 26 Jan 2022

Physics-informed ConvNet: Learning Physical Field from a Shallow Neural Network
Peng Shi, Zhi Zeng, Tianshou Liang · AI4CE · 26 Jan 2022

Communication-Efficient Stochastic Zeroth-Order Optimization for Federated Learning
Wenzhi Fang, Ziyi Yu, Yuning Jiang, Yuanming Shi, Colin N. Jones, Yong Zhou · FedML · 24 Jan 2022

Optimal variance-reduced stochastic approximation in Banach spaces
Wenlong Mou, K. Khamaru, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan · 21 Jan 2022

Near-Optimal Sparse Allreduce for Distributed Deep Learning
Shigang Li, Torsten Hoefler · 19 Jan 2022

On Maximum-a-Posteriori estimation with Plug & Play priors and stochastic gradient descent
R. Laumont, Valentin De Bortoli, Andrés Almansa, J. Delon, Alain Durmus, Marcelo Pereyra · 16 Jan 2022