A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method

10 December 2012
Simon Lacoste-Julien
Mark W. Schmidt
Francis R. Bach
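
The title names both the algorithm and the target rate. For orientation, below is a minimal sketch of a projected stochastic subgradient method with a weighted iterate average (weight on iterate t proportional to t) for a λ-strongly convex objective, the kind of scheme studied in this line of work. The step size 2/(λ(t+1)), the averaging weights, and the toy problem in the usage block are illustrative assumptions, not a restatement of the paper's exact algorithm or constants.

```python
import numpy as np

def projected_stochastic_subgradient(subgrad, project, x0, lam, T, rng=None):
    """Projected stochastic subgradient method for a lam-strongly convex
    objective over a convex set, returning a weighted average of the iterates
    (weight on iterate t proportional to t), the style of averaging associated
    with an O(1/T) rate in expectation."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    x_avg = x.copy()
    for t in range(1, T + 1):
        g = subgrad(x, rng)                           # stochastic subgradient at x
        x = project(x - 2.0 / (lam * (t + 1)) * g)    # step size 2 / (lam * (t + 1))
        # running form of the weighted average sum_{s<=t} 2s / (t(t+1)) * x_s
        x_avg = (t - 1) / (t + 1) * x_avg + 2.0 / (t + 1) * x
    return x_avg


# Toy usage (illustrative names and constants): minimize
# E[0.5 * ||x - z||^2] + (lam/2) * ||x||^2 over the unit ball, with z ~ N(1, 1).
if __name__ == "__main__":
    lam = 0.1

    def subgrad(x, rng):
        z = rng.normal(loc=1.0, scale=1.0, size=x.shape)
        return (x - z) + lam * x

    def project(x):                                   # Euclidean projection onto the unit ball
        n = np.linalg.norm(x)
        return x if n <= 1.0 else x / n

    x_bar = projected_stochastic_subgradient(subgrad, project, np.zeros(3), lam, T=10_000)
    print(x_bar)
```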

Papers citing "A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method"

35 / 35 papers shown
  1. Some Primal-Dual Theory for Subgradient Methods for Strongly Convex Optimization
     Benjamin Grimmer, Danlin Li · 31 Dec 2024
  2. Federated Cubic Regularized Newton Learning with Sparsification-amplified Differential Privacy
     Wei Huo, Changxin Liu, Kemi Ding, Karl H. Johansson, Ling Shi · FedML · 08 Aug 2024
  3. An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes
     Antonio Orvieto, Lin Xiao · 05 Jul 2024
  4. Simplifying Deep Temporal Difference Learning
     Matteo Gallici, Mattie Fellows, Benjamin Ellis, B. Pou, Ivan Masmitja, Jakob Foerster, Mario Martin · OffRL · 05 Jul 2024
  5. On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations
     Antoine Godichon-Baggioni, Nicklas Werge · ODL · 29 Nov 2023
  6. Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality
     Ziyang Wei, Wanrong Zhu, W. Wu · 13 Jul 2023
  7. Resilient Distributed Optimization for Multi-Agent Cyberphysical Systems
     M. Yemini, Angelia Nedić, Andrea J. Goldsmith, Stephanie Gil · 05 Dec 2022
  8. Distributed DP-Helmet: Scalable Differentially Private Non-interactive Averaging of Single Layers
     Moritz Kirschte, Sebastian Meiser, Saman Ardalan, Esfandiar Mohammadi · FedML · 03 Nov 2022
  9. Two-Tailed Averaging: Anytime, Adaptive, Once-in-a-While Optimal Weight Averaging for Better Generalization
     Gábor Melis · MoMe · 26 Sep 2022
  10. Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization
      Junchi Yang, Xiang Li, Niao He · ODL · 01 Jun 2022
  11. Do More Negative Samples Necessarily Hurt in Contrastive Learning?
      Pranjal Awasthi, Nishanth Dikkala, Pritish Kamath · 03 May 2022
  12. Tackling benign nonconvexity with smoothing and stochastic gradients
      Harsh Vardhan, Sebastian U. Stich · 18 Feb 2022
  13. When is the Convergence Time of Langevin Algorithms Dimension Independent? A Composite Optimization Viewpoint
      Y. Freund, Yi-An Ma, Tong Zhang · 05 Oct 2021
  14. On the Convergence of SGD with Biased Gradients
      Ahmad Ajalloeian, Sebastian U. Stich · 31 Jul 2020
  15. Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
      Diego Granziol, S. Zohren, Stephen J. Roberts · ODL · 16 Jun 2020
  16. FedSplit: An algorithmic framework for fast federated optimization
      Reese Pathak, Martin J. Wainwright · FedML · 11 May 2020
  17. A Unified Theory of Decentralized SGD with Changing Topology and Local Updates
      Anastasia Koloskova, Nicolas Loizou, Sadra Boreiri, Martin Jaggi, Sebastian U. Stich · FedML · 23 Mar 2020
  18. Iterative Averaging in the Quest for Best Test Error
      Diego Granziol, Xingchen Wan, Samuel Albanie, Stephen J. Roberts · 02 Mar 2020
  19. Unified Optimal Analysis of the (Stochastic) Gradient Method
      Sebastian U. Stich · 09 Jul 2019
  20. Reducing the variance in online optimization by transporting past gradients
      Sébastien M. R. Arnold, Pierre-Antoine Manzagol, Reza Babanezhad, Ioannis Mitliagkas, Nicolas Le Roux · 08 Jun 2019
  21. The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares
      Rong Ge, Sham Kakade, Rahul Kidambi, Praneeth Netrapalli · 29 Apr 2019
  22. Learning-to-Learn Stochastic Gradient Descent with Biased Regularization
      Giulia Denevi, C. Ciliberto, Riccardo Grazzi, Massimiliano Pontil · 25 Mar 2019
  23. Tight Analyses for Non-Smooth Stochastic Gradient Descent
      Nicholas J. A. Harvey, Christopher Liaw, Y. Plan, Sikander Randhawa · 13 Dec 2018
  24. Don't Use Large Mini-Batches, Use Local SGD
      Tao R. Lin, Sebastian U. Stich, Kumar Kshitij Patel, Martin Jaggi · 22 Aug 2018
  25. Local SGD Converges Fast and Communicates Little
      Sebastian U. Stich · FedML · 24 May 2018
  26. Stochastic model-based minimization of weakly convex functions
      Damek Davis, D. Drusvyatskiy · 17 Mar 2018
  27. Convergence of Online Mirror Descent
      Yunwen Lei, Ding-Xuan Zhou · 18 Feb 2018
  28. Safe Adaptive Importance Sampling
      Sebastian U. Stich, Anant Raj, Martin Jaggi · 07 Nov 2017
  29. Memory and Communication Efficient Distributed Stochastic Optimization with Minibatch-Prox
      Jialei Wang, Weiran Wang, Nathan Srebro · 21 Feb 2017
  30. Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite-Sum Structure
      A. Bietti, Julien Mairal · 04 Oct 2016
  31. Katyusha: The First Direct Acceleration of Stochastic Gradient Methods
      Zeyuan Allen-Zhu · ODL · 18 Mar 2016
  32. First-order Methods for Geodesically Convex Optimization
      Hongyi Zhang, S. Sra · 19 Feb 2016
  33. RSG: Beating Subgradient Method without Smoothness and Strong Convexity
      Tianbao Yang, Qihang Lin · 09 Dec 2015
  34. A Variance Reduced Stochastic Newton Method
      Aurélien Lucchi, Brian McWilliams, Thomas Hofmann · ODL · 28 Mar 2015
  35. Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
      Ohad Shamir, Tong Zhang · 08 Dec 2012