2006.13838
Advances in Asynchronous Parallel and Distributed Optimization
24 June 2020
By Mahmoud Assran
Arda Aytekin
Hamid Reza Feyzmahdavian
M. Johansson
Michael G. Rabbat
Papers citing
"Advances in Asynchronous Parallel and Distributed Optimization"
45 / 45 papers shown
Title
Asynchronous Stochastic Gradient Descent with Decoupled Backpropagation and Layer-Wise Updates
Cabrel Teguemne Fokam
Khaleelulla Khan Nazeer
Lukas König
David Kappel
Anand Subramoney
56
0
0
08 Oct 2024
Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems
Filip Hanzely
D. Kovalev
Peter Richtárik
70
17
0
11 Feb 2020
Stochastic Newton and Cubic Newton Methods with Simple Local Linear-Quadratic Rates
D. Kovalev
Konstantin Mishchenko
Peter Richtárik
ODL
51
45
0
03 Dec 2019
SySCD: A System-Aware Parallel Coordinate Descent Algorithm
Nikolas Ioannou
Celestine Mendler-Dünner
Thomas Parnell
100
3
0
18 Nov 2019
SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum
Jianyu Wang
Vinayak Tantia
Nicolas Ballas
Michael G. Rabbat
58
201
0
01 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
612
24,431
0
26 Jul 2019
Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning
Shi Pu
Alexander Olshevsky
I. Paschalidis
53
41
0
28 Jun 2019
On Linear Learning with Manycore Processors
Eliza Wszola
Celestine Mendler-Dünner
Martin Jaggi
Markus Püschel
42
1
0
02 May 2019
An Asynchronous, Decentralized Solution Framework for the Large Scale Unit Commitment Problem
P. Ramanan
Murat Yildirim
Edmond Chow
N. Gebraeel
133
22
0
07 Apr 2019
Measuring scheduling efficiency of RNNs for NLP applications
Urmish Thakker
Ganesh S. Dasika
Jesse G. Beu
Matthew Mattina
50
13
0
05 Apr 2019
Stochastic Gradient Push for Distributed Deep Learning
Mahmoud Assran
Nicolas Loizou
Nicolas Ballas
Michael G. Rabbat
76
345
0
27 Nov 2018
SEGA: Variance Reduction via Gradient Sketching
Filip Hanzely
Konstantin Mishchenko
Peter Richtárik
50
71
0
09 Sep 2018
AsySPA: An Exact Asynchronous Algorithm for Convex Optimization Over Digraphs
Jiaqi Zhang
Keyou You
36
74
0
13 Aug 2018
Exploring the Limits of Weakly Supervised Pretraining
D. Mahajan
Ross B. Girshick
Vignesh Ramanathan
Kaiming He
Manohar Paluri
Yixuan Li
Ashwin R. Bharambe
Laurens van der Maaten
VLM
180
1,367
0
02 May 2018
Asynchronous Gradient-Push
Mahmoud Assran
Michael G. Rabbat
51
64
0
23 Mar 2018
SGD and Hogwild! Convergence Without the Bounded Gradients Assumption
Lam M. Nguyen
Phuong Ha Nguyen
Marten van Dijk
Peter Richtárik
K. Scheinberg
Martin Takáč
66
228
0
11 Feb 2018
Asynchronous Decentralized Parallel Stochastic Gradient Descent
Xiangru Lian
Wei Zhang
Ce Zhang
Ji Liu
ODL
46
500
0
18 Oct 2017
Network Topology and Communication-Computation Tradeoffs in Decentralized Optimization
A. Nedić
Alexander Olshevsky
Michael G. Rabbat
58
509
0
26 Sep 2017
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization
Fabian Pedregosa
Rémi Leblond
Simon Lacoste-Julien
53
34
0
20 Jul 2017
Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent
Xiangru Lian
Ce Zhang
Huan Zhang
Cho-Jui Hsieh
Wei Zhang
Ji Liu
50
1,227
0
25 May 2017
Surpassing Gradient Descent Provably: A Cyclic Incremental Method with Linear Convergence Rate
Aryan Mokhtari
Mert Gurbuzbalaban
Alejandro Ribeiro
94
36
0
01 Nov 2016
Analysis and Implementation of an Asynchronous Optimization Algorithm for the Parameter Server
Arda Aytekin
Hamid Reza Feyzmahdavian
M. Johansson
129
54
0
18 Oct 2016
Optimization Methods for Large-Scale Machine Learning
Léon Bottou
Frank E. Curtis
J. Nocedal
233
3,206
0
15 Jun 2016
ASAGA: Asynchronous Parallel SAGA
Rémi Leblond
Fabian Pedregosa
Simon Lacoste-Julien
AI4TS
60
101
0
15 Jun 2016
Revisiting Distributed Synchronous SGD
Jianmin Chen
Xinghao Pan
R. Monga
Samy Bengio
Rafal Jozefowicz
87
799
0
04 Apr 2016
Perturbed Iterate Analysis for Asynchronous Stochastic Optimization
Horia Mania
Xinghao Pan
Dimitris Papailiopoulos
Benjamin Recht
Kannan Ramchandran
Michael I. Jordan
89
232
0
24 Jul 2015
Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization
Xiangru Lian
Yijun Huang
Y. Li
Ji Liu
135
499
0
27 Jun 2015
Taming the Wild: A Unified Analysis of Hogwild!-Style Algorithms
Christopher De Sa
Ce Zhang
K. Olukotun
Christopher Ré
80
204
0
22 Jun 2015
ARock: an Algorithmic Framework for Asynchronous Parallel Coordinate Updates
Zhimin Peng
Yangyang Xu
Ming Yan
W. Yin
75
258
0
08 Jun 2015
An Asynchronous Mini-Batch Algorithm for Regularized Stochastic Optimization
Hamid Reza Feyzmahdavian
Arda Aytekin
M. Johansson
51
117
0
18 May 2015
Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields
Mark Schmidt
Reza Babanezhad
Mohamed Osama Ahmed
Aaron Defazio
Ann Clifton
Anoop Sarkar
72
83
0
16 Apr 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.8K
150,039
0
22 Dec 2014
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.7K
39,525
0
01 Sep 2014
Finito: A Faster, Permutable Incremental Gradient Method for Big Data Problems
Aaron Defazio
T. Caetano
Justin Domke
105
169
0
10 Jul 2014
SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives
Aaron Defazio
Francis R. Bach
Simon Lacoste-Julien
ODL
131
1,822
0
01 Jul 2014
Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning
Julien Mairal
144
318
0
18 Feb 2014
Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm
Deanna Needell
Nathan Srebro
Rachel A. Ward
134
554
0
21 Oct 2013
Accelerated Mini-Batch Stochastic Dual Coordinate Ascent
Shai Shalev-Shwartz
Tong Zhang
ODL
101
150
0
12 May 2013
Parallel Coordinate Descent Methods for Big Data Optimization
Peter Richtárik
Martin Takáč
127
487
0
04 Dec 2012
A Reliable Effective Terascale Linear Learning System
Alekh Agarwal
O. Chapelle
Miroslav Dudík
John Langford
91
418
0
19 Oct 2011
HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
Feng Niu
Benjamin Recht
Christopher Ré
Stephen J. Wright
191
2,273
0
28 Jun 2011
Parallel Coordinate Descent for L1-Regularized Loss Minimization
Joseph K. Bradley
Aapo Kyrola
Danny Bickson
Carlos Guestrin
97
309
0
26 May 2011
Distributed Delayed Stochastic Optimization
Alekh Agarwal
John C. Duchi
123
626
0
28 Apr 2011
Optimal Distributed Online Prediction using Mini-Batches
O. Dekel
Ran Gilad-Bachrach
Ohad Shamir
Lin Xiao
259
685
0
07 Dec 2010
Slow Learners are Fast
John Langford
Alex Smola
Martin A. Zinkevich
111
391
0
03 Nov 2009