Advances in Asynchronous Parallel and Distributed Optimization

24 June 2020

By Mahmoud Assran, Arda Aytekin, Hamid Reza Feyzmahdavian, M. Johansson, Michael G. Rabbat


Papers citing "Advances in Asynchronous Parallel and Distributed Optimization"


Title | Authors | Tags | Counts | Date
Asynchronous Stochastic Gradient Descent with Decoupled Backpropagation and Layer-Wise Updates | Cabrel Teguemne Fokam, Khaleelulla Khan Nazeer, Lukas König, David Kappel, Anand Subramoney | - | 56 0 0 | 08 Oct 2024
Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems | Filip Hanzely, D. Kovalev, Peter Richtárik | - | 70 17 0 | 11 Feb 2020
Stochastic Newton and Cubic Newton Methods with Simple Local Linear-Quadratic Rates | D. Kovalev, Konstantin Mishchenko, Peter Richtárik | ODL | 51 45 0 | 03 Dec 2019
SySCD: A System-Aware Parallel Coordinate Descent Algorithm | Nikolas Ioannou, Celestine Mendler-Dünner, Thomas Parnell | - | 100 3 0 | 18 Nov 2019
SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum | Jianyu Wang, Vinayak Tantia, Nicolas Ballas, Michael G. Rabbat | - | 58 201 0 | 01 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach | Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov | AIMat | 612 24,431 0 | 26 Jul 2019
Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning | Shi Pu, Alexander Olshevsky, I. Paschalidis | - | 53 41 0 | 28 Jun 2019
On Linear Learning with Manycore Processors | Eliza Wszola, Celestine Mendler-Dünner, Martin Jaggi, Markus Püschel | - | 42 1 0 | 02 May 2019
An Asynchronous, Decentralized Solution Framework for the Large Scale Unit Commitment Problem | P. Ramanan, Murat Yildirim, Edmond Chow, N. Gebraeel | - | 133 22 0 | 07 Apr 2019
Measuring scheduling efficiency of RNNs for NLP applications | Urmish Thakker, Ganesh S. Dasika, Jesse G. Beu, Matthew Mattina | - | 50 13 0 | 05 Apr 2019
Stochastic Gradient Push for Distributed Deep Learning | Mahmoud Assran, Nicolas Loizou, Nicolas Ballas, Michael G. Rabbat | - | 76 345 0 | 27 Nov 2018
SEGA: Variance Reduction via Gradient Sketching | Filip Hanzely, Konstantin Mishchenko, Peter Richtárik | - | 50 71 0 | 09 Sep 2018
AsySPA: An Exact Asynchronous Algorithm for Convex Optimization Over Digraphs | Jiaqi Zhang, Keyou You | - | 36 74 0 | 13 Aug 2018
Exploring the Limits of Weakly Supervised Pretraining | D. Mahajan, Ross B. Girshick, Vignesh Ramanathan, Kaiming He, Manohar Paluri, Yixuan Li, Ashwin R. Bharambe, Laurens van der Maaten | VLM | 180 1,367 0 | 02 May 2018
Asynchronous Gradient-Push | Mahmoud Assran, Michael G. Rabbat | - | 51 64 0 | 23 Mar 2018
SGD and Hogwild! Convergence Without the Bounded Gradients Assumption | Lam M. Nguyen, Phuong Ha Nguyen, Marten van Dijk, Peter Richtárik, K. Scheinberg, Martin Takáč | - | 66 228 0 | 11 Feb 2018
Asynchronous Decentralized Parallel Stochastic Gradient Descent | Xiangru Lian, Wei Zhang, Ce Zhang, Ji Liu | ODL | 46 500 0 | 18 Oct 2017
Network Topology and Communication-Computation Tradeoffs in Decentralized Optimization | A. Nedić, Alexander Olshevsky, Michael G. Rabbat | - | 58 509 0 | 26 Sep 2017
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization | Fabian Pedregosa, Rémi Leblond, Simon Lacoste-Julien | - | 53 34 0 | 20 Jul 2017
Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent | Xiangru Lian, Ce Zhang, Huan Zhang, Cho-Jui Hsieh, Wei Zhang, Ji Liu | - | 50 1,227 0 | 25 May 2017
Surpassing Gradient Descent Provably: A Cyclic Incremental Method with Linear Convergence Rate | Aryan Mokhtari, Mert Gurbuzbalaban, Alejandro Ribeiro | - | 94 36 0 | 01 Nov 2016
Analysis and Implementation of an Asynchronous Optimization Algorithm for the Parameter Server | Arda Aytekin, Hamid Reza Feyzmahdavian, M. Johansson | - | 129 54 0 | 18 Oct 2016
Optimization Methods for Large-Scale Machine Learning | Léon Bottou, Frank E. Curtis, J. Nocedal | - | 233 3,206 0 | 15 Jun 2016
ASAGA: Asynchronous Parallel SAGA | Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien | AI4TS | 60 101 0 | 15 Jun 2016
Revisiting Distributed Synchronous SGD | Jianmin Chen, Xinghao Pan, R. Monga, Samy Bengio, Rafal Jozefowicz | - | 87 799 0 | 04 Apr 2016
Perturbed Iterate Analysis for Asynchronous Stochastic Optimization | Horia Mania, Xinghao Pan, Dimitris Papailiopoulos, Benjamin Recht, Kannan Ramchandran, Michael I. Jordan | - | 89 232 0 | 24 Jul 2015
Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization | Xiangru Lian, Yijun Huang, Y. Li, Ji Liu | - | 135 499 0 | 27 Jun 2015
Taming the Wild: A Unified Analysis of Hogwild!-Style Algorithms | Christopher De Sa, Ce Zhang, K. Olukotun, Christopher Ré | - | 80 204 0 | 22 Jun 2015
ARock: an Algorithmic Framework for Asynchronous Parallel Coordinate Updates | Zhimin Peng, Yangyang Xu, Ming Yan, W. Yin | - | 75 258 0 | 08 Jun 2015
An Asynchronous Mini-Batch Algorithm for Regularized Stochastic Optimization | Hamid Reza Feyzmahdavian, Arda Aytekin, M. Johansson | - | 51 117 0 | 18 May 2015
Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields | Mark Schmidt, Reza Babanezhad, Mohamed Osama Ahmed, Aaron Defazio, Ann Clifton, Anoop Sarkar | - | 72 83 0 | 16 Apr 2015
Adam: A Method for Stochastic Optimization | Diederik P. Kingma, Jimmy Ba | ODL | 1.8K 150,039 0 | 22 Dec 2014
ImageNet Large Scale Visual Recognition Challenge | Olga Russakovsky, Jia Deng, Hao Su, J. Krause, S. Satheesh, ..., A. Karpathy, A. Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei | VLM, ObjD | 1.7K 39,525 0 | 01 Sep 2014
Finito: A Faster, Permutable Incremental Gradient Method for Big Data Problems | Aaron Defazio, T. Caetano, Justin Domke | - | 105 169 0 | 10 Jul 2014
SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives | Aaron Defazio, Francis R. Bach, Simon Lacoste-Julien | ODL | 131 1,822 0 | 01 Jul 2014
Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning | Julien Mairal | - | 144 318 0 | 18 Feb 2014
Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm | Deanna Needell, Nathan Srebro, Rachel A. Ward | - | 134 554 0 | 21 Oct 2013
Accelerated Mini-Batch Stochastic Dual Coordinate Ascent | Shai Shalev-Shwartz, Tong Zhang | ODL | 101 150 0 | 12 May 2013
Parallel Coordinate Descent Methods for Big Data Optimization | Peter Richtárik, Martin Takáč | - | 127 487 0 | 04 Dec 2012
A Reliable Effective Terascale Linear Learning System | Alekh Agarwal, O. Chapelle, Miroslav Dudík, John Langford | - | 91 418 0 | 19 Oct 2011
HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent | Feng Niu, Benjamin Recht, Christopher Ré, Stephen J. Wright | - | 191 2,273 0 | 28 Jun 2011
Parallel Coordinate Descent for L1-Regularized Loss Minimization | Joseph K. Bradley, Aapo Kyrola, Danny Bickson, Carlos Guestrin | - | 97 309 0 | 26 May 2011
Distributed Delayed Stochastic Optimization | Alekh Agarwal, John C. Duchi | - | 123 626 0 | 28 Apr 2011
Optimal Distributed Online Prediction using Mini-Batches | O. Dekel, Ran Gilad-Bachrach, Ohad Shamir, Lin Xiao | - | 259 685 0 | 07 Dec 2010
Slow Learners are Fast | John Langford, Alex Smola, Martin A. Zinkevich | - | 111 391 0 | 03 Nov 2009