Optimal Distributed Online Prediction using Mini-Batches

7 December 2010

Papers citing "Optimal Distributed Online Prediction using Mini-Batches"

47 / 97 papers shown

Title
Effective Parallelisation for Machine Learning Michael Kamp Mario Boley Olana Missura Thomas Gärtner 11 12 0 08 Oct 2018
Anytime Stochastic Gradient Descent: A Time to Hear from all the Workers Nuwan S. Ferdinand S. Draper 13 19 0 06 Oct 2018
Graph-Dependent Implicit Regularisation for Distributed Stochastic Subgradient Descent Dominic Richards Patrick Rebeschini 16 18 0 18 Sep 2018
Cooperative SGD: A unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms Jianyu Wang Gauri Joshi 13 348 0 22 Aug 2018
Don't Use Large Mini-Batches, Use Local SGD Tao R. Lin Sebastian U. Stich Kumar Kshitij Patel Martin Jaggi 48 429 0 22 Aug 2018
Parallelization does not Accelerate Convex Optimization: Adaptivity Lower Bounds for Non-smooth Convex Minimization Eric Balkanski Yaron Singer 16 31 0 12 Aug 2018
Efficient Decentralized Deep Learning by Dynamic Model Averaging Michael Kamp Linara Adilova Joachim Sicking Fabian Hüger Peter Schlicht Tim Wirtz Stefan Wrobel 27 128 0 09 Jul 2018
The Effect of Network Width on the Performance of Large-batch Training Lingjiao Chen Hongyi Wang Jinman Zhao Dimitris Papailiopoulos Paraschos Koutris 10 22 0 11 Jun 2018
Local SGD Converges Fast and Communicates Little Sebastian U. Stich FedML 46 1,043 0 24 May 2018
Stochastic modified equations for the asynchronous stochastic gradient descent Jing An Jian-wei Lu Lexing Ying 21 79 0 21 May 2018
On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes Xiaoyun Li Francesco Orabona 32 290 0 21 May 2018
Slow and Stale Gradients Can Win the Race: Error-Runtime Trade-offs in Distributed SGD Sanghamitra Dutta Gauri Joshi Soumyadip Ghosh Parijat Dube P. Nagpurkar 12 193 0 03 Mar 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis Tal Ben-Nun Torsten Hoefler GNN 30 701 0 26 Feb 2018
Online Learning: A Comprehensive Survey S. Hoi Doyen Sahoo Jing Lu P. Zhao OffRL 27 630 0 08 Feb 2018
Convergence Analysis of Distributed Stochastic Gradient Descent with Shuffling Qi Meng Wei-neng Chen Yue Wang Zhi-Ming Ma Tie-Yan Liu FedML 16 101 0 29 Sep 2017
Stochastic Nonconvex Optimization with Large Minibatches Weiran Wang Nathan Srebro 36 26 0 25 Sep 2017
On the convergence properties of a $K$-step averaging stochastic gradient descent algorithm for nonconvex optimization Fan Zhou Guojing Cong 32 232 0 03 Aug 2017
Stochastic Optimization from Distributed, Streaming Data in Rate-limited Networks M. Nokleby W. Bajwa 13 16 0 25 Apr 2017
Stochastic Composite Least-Squares Regression with convergence rate O(1/n) Nicolas Flammarion Francis R. Bach 19 27 0 21 Feb 2017
Memory and Communication Efficient Distributed Stochastic Optimization with Minibatch-Prox Jialei Wang Weiran Wang Nathan Srebro 10 54 0 21 Feb 2017
Optimization for Large-Scale Machine Learning with Distributed Features and Observations A. Nathan Diego Klabjan 22 13 0 31 Oct 2016
Analysis and Implementation of an Asynchronous Optimization Algorithm for the Parameter Server Arda Aytekin Hamid Reza Feyzmahdavian M. Johansson 14 54 0 18 Oct 2016
Parallelizing Stochastic Gradient Descent for Least Squares Regression: mini-batching, averaging, and model misspecification Prateek Jain Sham Kakade Rahul Kidambi Praneeth Netrapalli Aaron Sidford MoMe 13 36 0 12 Oct 2016
Federated Optimization: Distributed Machine Learning for On-Device Intelligence Jakub Konečný H. B. McMahan Daniel Ramage Peter Richtárik FedML 22 1,876 0 08 Oct 2016
Distributed learning with regularized least squares Shaobo Lin Xin Guo Ding-Xuan Zhou 35 190 0 11 Aug 2016
Bootstrap Model Aggregation for Distributed Statistical Learning J. Han Qiang Liu FedML 13 8 0 04 Jul 2016
Parallel SGD: When does averaging help? Jian Zhang Christopher De Sa Ioannis Mitliagkas Christopher Ré MoMe FedML 46 109 0 23 Jun 2016
Optimal Rates for Multi-pass Stochastic Gradient Methods Junhong Lin Lorenzo Rosasco 20 43 0 28 May 2016
Accelerating Deep Neural Network Training with Inconsistent Stochastic Gradient Descent Linnan Wang Yi Yang Martin Renqiang Min S. Chakradhar 13 91 0 17 Mar 2016
Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization Xiangru Lian Yijun Huang Y. Li Ji Liu 25 498 0 27 Jun 2015
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants Sashank J. Reddi Ahmed S. Hefny S. Sra Barnabás Póczos Alex Smola 30 194 0 23 Jun 2015
Communication Complexity of Distributed Convex Learning and Optimization Yossi Arjevani Ohad Shamir 29 205 0 05 Jun 2015
Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting Jakub Konečný Jie Liu Peter Richtárik Martin Takáč ODL 20 273 0 16 Apr 2015
Communication-efficient sparse regression: a one-shot approach J. Lee Yuekai Sun Qiang Liu Jonathan E. Taylor 35 65 0 14 Mar 2015
Communication-Efficient Distributed Optimization of Self-Concordant Empirical Loss Yuchen Zhang Lin Xiao 33 72 0 01 Jan 2015
Online and Stochastic Gradient Methods for Non-decomposable Loss Functions Purushottam Kar Harikrishna Narasimhan Prateek Jain 56 71 0 24 Oct 2014
Median Selection Subset Aggregation for Parallel Inference Xiangyu Wang Peichao Peng David B. Dunson 36 23 0 24 Oct 2014
Distributed Detection: Finite-time Analysis and Impact of Network Topology Shahin Shahrampour Alexander Rakhlin Ali Jadbabaie 49 114 0 30 Sep 2014
Communication-Efficient Distributed Dual Coordinate Ascent Martin Jaggi Virginia Smith Martin Takáč Jonathan Terhorst S. Krishnan Thomas Hofmann Michael I. Jordan 24 353 0 04 Sep 2014
Exploiting Smoothness in Statistical Learning, Sequential Prediction, and Stochastic Optimization M. Mahdavi 57 4 0 19 Jul 2014
A Distributed Frank-Wolfe Algorithm for Communication-Efficient Sparse Learning A. Bellet Yingyu Liang A. Garakani Maria-Florina Balcan Fei Sha FedML 30 49 0 09 Apr 2014
Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation Ohad Shamir 58 108 0 14 Nov 2013
Exponentially Fast Parameter Estimation in Networks Using Distributed Dual Averaging Shahin Shahrampour Ali Jadbabaie FedML 51 76 0 10 Sep 2013
MixedGrad: An O(1/T) Convergence Rate Algorithm for Stochastic Smooth Optimization M. Mahdavi R. L. Jin 53 17 0 26 Jul 2013
Mini-Batch Primal and Dual Methods for SVMs Martin Takáč A. Bijral Peter Richtárik Nathan Srebro 28 194 0 10 Mar 2013
Online Alternating Direction Method Huahua Wang Arindam Banerjee 56 165 0 27 Jun 2012
Stochastic Smoothing for Nonsmooth Minimizations: Accelerating SGD by Exploiting Structure H. Ouyang Alexander G. Gray 43 28 0 21 May 2012