arXiv: 1012.1367
Optimal Distributed Online Prediction using Mini-Batches


7 December 2010
O. Dekel
Ran Gilad-Bachrach
Ohad Shamir
Lin Xiao

Papers citing "Optimal Distributed Online Prediction using Mini-Batches"

47 / 97 papers shown
Effective Parallelisation for Machine Learning
Michael Kamp
Mario Boley
Olana Missura
Thomas Gärtner
11
12
0
08 Oct 2018
Anytime Stochastic Gradient Descent: A Time to Hear from all the Workers
Nuwan S. Ferdinand
S. Draper
13
19
0
06 Oct 2018
Graph-Dependent Implicit Regularisation for Distributed Stochastic Subgradient Descent
Dominic Richards
Patrick Rebeschini
16
18
0
18 Sep 2018
Cooperative SGD: A unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms
Jianyu Wang
Gauri Joshi
13
348
0
22 Aug 2018
Don't Use Large Mini-Batches, Use Local SGD
Tao R. Lin
Sebastian U. Stich
Kumar Kshitij Patel
Martin Jaggi
48
429
0
22 Aug 2018
Parallelization does not Accelerate Convex Optimization: Adaptivity Lower Bounds for Non-smooth Convex Minimization
Eric Balkanski
Yaron Singer
16
31
0
12 Aug 2018
Efficient Decentralized Deep Learning by Dynamic Model Averaging
Michael Kamp
Linara Adilova
Joachim Sicking
Fabian Hüger
Peter Schlicht
Tim Wirtz
Stefan Wrobel
27
128
0
09 Jul 2018
The Effect of Network Width on the Performance of Large-batch Training
Lingjiao Chen
Hongyi Wang
Jinman Zhao
Dimitris Papailiopoulos
Paraschos Koutris
10
22
0
11 Jun 2018
Local SGD Converges Fast and Communicates Little
Sebastian U. Stich
FedML
46
1,043
0
24 May 2018
Stochastic modified equations for the asynchronous stochastic gradient descent
Jing An
Jianfeng Lu
Lexing Ying
21
79
0
21 May 2018
On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes
Xiaoyun Li
Francesco Orabona
32
290
0
21 May 2018
Slow and Stale Gradients Can Win the Race: Error-Runtime Trade-offs in Distributed SGD
Sanghamitra Dutta
Gauri Joshi
Soumyadip Ghosh
Parijat Dube
P. Nagpurkar
12
193
0
03 Mar 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Tal Ben-Nun
Torsten Hoefler
GNN
30
701
0
26 Feb 2018
Online Learning: A Comprehensive Survey
S. Hoi
Doyen Sahoo
Jing Lu
P. Zhao
OffRL
27
630
0
08 Feb 2018
Convergence Analysis of Distributed Stochastic Gradient Descent with Shuffling
Qi Meng
Wei-neng Chen
Yue Wang
Zhi-Ming Ma
Tie-Yan Liu
FedML
16
101
0
29 Sep 2017
Stochastic Nonconvex Optimization with Large Minibatches
Weiran Wang
Nathan Srebro
36
26
0
25 Sep 2017
On the convergence properties of a $K$-step averaging stochastic gradient descent algorithm for nonconvex optimization
Fan Zhou
Guojing Cong
32
232
0
03 Aug 2017
Stochastic Optimization from Distributed, Streaming Data in Rate-limited Networks
M. Nokleby
W. Bajwa
13
16
0
25 Apr 2017
Stochastic Composite Least-Squares Regression with convergence rate O(1/n)
Nicolas Flammarion
Francis R. Bach
19
27
0
21 Feb 2017
Memory and Communication Efficient Distributed Stochastic Optimization with Minibatch-Prox
Jialei Wang
Weiran Wang
Nathan Srebro
10
54
0
21 Feb 2017
Optimization for Large-Scale Machine Learning with Distributed Features and Observations
A. Nathan
Diego Klabjan
22
13
0
31 Oct 2016
Analysis and Implementation of an Asynchronous Optimization Algorithm for the Parameter Server
Arda Aytekin
Hamid Reza Feyzmahdavian
M. Johansson
14
54
0
18 Oct 2016
Parallelizing Stochastic Gradient Descent for Least Squares Regression: mini-batching, averaging, and model misspecification
Prateek Jain
Sham Kakade
Rahul Kidambi
Praneeth Netrapalli
Aaron Sidford
MoMe
13
36
0
12 Oct 2016
Federated Optimization: Distributed Machine Learning for On-Device Intelligence
Jakub Konečný
H. B. McMahan
Daniel Ramage
Peter Richtárik
FedML
22
1,876
0
08 Oct 2016
Distributed learning with regularized least squares
Shaobo Lin
Xin Guo
Ding-Xuan Zhou
35
190
0
11 Aug 2016
Bootstrap Model Aggregation for Distributed Statistical Learning
J. Han
Qiang Liu
FedML
13
8
0
04 Jul 2016
Parallel SGD: When does averaging help?
Jian Zhang
Christopher De Sa
Ioannis Mitliagkas
Christopher Ré
MoMe
FedML
46
109
0
23 Jun 2016
Alternative asymptotics for cointegration tests in large VARs
Junhong Lin
Lorenzo Rosasco
20
43
0
28 May 2016
Accelerating Deep Neural Network Training with Inconsistent Stochastic Gradient Descent
Linnan Wang
Yi Yang
Martin Renqiang Min
S. Chakradhar
13
91
0
17 Mar 2016
Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization
Xiangru Lian
Yijun Huang
Y. Li
Ji Liu
25
498
0
27 Jun 2015
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants
Sashank J. Reddi
Ahmed S. Hefny
S. Sra
Barnabás Póczos
Alex Smola
30
194
0
23 Jun 2015
Communication Complexity of Distributed Convex Learning and Optimization
Yossi Arjevani
Ohad Shamir
29
205
0
05 Jun 2015
Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting
Jakub Konečný
Jie Liu
Peter Richtárik
Martin Takáč
ODL
20
273
0
16 Apr 2015
Communication-efficient sparse regression: a one-shot approach
J. Lee
Yuekai Sun
Qiang Liu
Jonathan E. Taylor
35
65
0
14 Mar 2015
Communication-Efficient Distributed Optimization of Self-Concordant Empirical Loss
Yuchen Zhang
Lin Xiao
33
72
0
01 Jan 2015
Online and Stochastic Gradient Methods for Non-decomposable Loss Functions
Purushottam Kar
Harikrishna Narasimhan
Prateek Jain
56
71
0
24 Oct 2014
Median Selection Subset Aggregation for Parallel Inference
Xiangyu Wang
Peichao Peng
David B. Dunson
36
23
0
24 Oct 2014
Distributed Detection : Finite-time Analysis and Impact of Network Topology
Shahin Shahrampour
Alexander Rakhlin
Ali Jadbabaie
49
114
0
30 Sep 2014
Communication-Efficient Distributed Dual Coordinate Ascent
Martin Jaggi
Virginia Smith
Martin Takáč
Jonathan Terhorst
S. Krishnan
Thomas Hofmann
Michael I. Jordan
24
353
0
04 Sep 2014
Exploiting Smoothness in Statistical Learning, Sequential Prediction, and Stochastic Optimization
M. Mahdavi
57
4
0
19 Jul 2014
A Distributed Frank-Wolfe Algorithm for Communication-Efficient Sparse Learning
A. Bellet
Yingyu Liang
A. Garakani
Maria-Florina Balcan
Fei Sha
FedML
30
49
0
09 Apr 2014
Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation
Ohad Shamir
58
108
0
14 Nov 2013
Exponentially Fast Parameter Estimation in Networks Using Distributed Dual Averaging
Shahin Shahrampour
Ali Jadbabaie
FedML
51
76
0
10 Sep 2013
MixedGrad: An O(1/T) Convergence Rate Algorithm for Stochastic Smooth Optimization
M. Mahdavi
R. L. Jin
53
17
0
26 Jul 2013
Mini-Batch Primal and Dual Methods for SVMs
Martin Takáč
A. Bijral
Peter Richtárik
Nathan Srebro
28
194
0
10 Mar 2013
Online Alternating Direction Method
Huahua Wang
Arindam Banerjee
56
165
0
27 Jun 2012
Stochastic Smoothing for Nonsmooth Minimizations: Accelerating SGD by Exploiting Structure
H. Ouyang
Alexander G. Gray
43
28
0
21 May 2012