arXiv: 1603.05953 (v6, latest)
Katyusha: The First Direct Acceleration of Stochastic Gradient Methods
18 March 2016
Zeyuan Allen-Zhu
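The header above only names the paper; for orientation, a minimal sketch of the Katyusha update (an SVRG-style variance-reduced gradient combined with "Katyusha momentum" toward a snapshot point) might look like the following. The parameter choices (`theta1`, `theta2`, `alpha`, inner-loop length `m = 2n`) follow the paper's strongly convex variant, but the least-squares objective, the simple snapshot average, and the default `mu` are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def katyusha(A, b, epochs=100, mu=1e-2):
    """Sketch of Katyusha for f(x) = (1/2n) * ||Ax - b||^2,
    written as a finite sum of f_i(x) = 0.5 * (a_i @ x - b_i)^2."""
    n, d = A.shape
    L = max(np.sum(A * A, axis=1))   # smoothness constant of each f_i
    m = 2 * n                        # inner-loop (epoch) length
    theta2 = 0.5                     # weight on the snapshot point
    theta1 = min(np.sqrt(m * mu / (3 * L)), 0.5)
    alpha = 1.0 / (3 * theta1 * L)
    rng = np.random.default_rng(0)

    x = y = z = w = np.zeros(d)      # w is the snapshot point
    for _ in range(epochs):
        full_grad = A.T @ (A @ w - b) / n     # anchor gradient at snapshot
        y_sum = np.zeros(d)
        for _ in range(m):
            # Katyusha momentum: x is a convex combination of z, snapshot, y
            x = theta1 * z + theta2 * w + (1 - theta1 - theta2) * y
            i = rng.integers(n)
            # variance-reduced stochastic gradient at x
            g = (A[i] @ x - b[i]) * A[i] - (A[i] @ w - b[i]) * A[i] + full_grad
            z = z - alpha * g        # long (mirror) step
            y = x - g / (3 * L)      # short (gradient) step
            y_sum += y
        w = y_sum / m                # plain average; the paper weights it
    return w
```

On a small, well-conditioned least-squares instance this converges to the minimizer; the snapshot update here uses a uniform average of the `y` iterates, whereas the paper's analysis uses a geometrically weighted one.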
Papers citing "Katyusha: The First Direct Acceleration of Stochastic Gradient Methods" (50 of 192 shown):
- Towards Better Generalization: BP-SVRG in Training Deep Neural Networks. Hao Jin, Dachao Lin, Zhihua Zhang. 18 Aug 2019.
- A Data Efficient and Feasible Level Set Method for Stochastic Convex Optimization with Expectation Constraints. Qihang Lin, Selvaprabu Nadarajah, Negar Soheili, Tianbao Yang. 07 Aug 2019.
- Lookahead Optimizer: k steps forward, 1 step back. Michael Ruogu Zhang, James Lucas, Geoffrey E. Hinton, Jimmy Ba. 19 Jul 2019.
- A Hybrid Stochastic Optimization Framework for Stochastic Composite Nonconvex Optimization. Quoc Tran-Dinh, Nhan H. Pham, T. Dzung, Lam M. Nguyen. 08 Jul 2019.
- Variance Reduction for Matrix Games. Y. Carmon, Yujia Jin, Aaron Sidford, Kevin Tian. 03 Jul 2019.
- Globally Convergent Newton Methods for Ill-conditioned Generalized Self-concordant Losses. Ulysse Marteau-Ferey, Francis R. Bach, Alessandro Rudi. 03 Jul 2019.
- The Role of Memory in Stochastic Optimization. Antonio Orvieto, Jonas Köhler, Aurelien Lucchi. 02 Jul 2019.
- Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond. Oliver Hinder, Aaron Sidford, N. Sohoni. 27 Jun 2019.
- Submodular Batch Selection for Training Deep Neural Networks. K. J. Joseph, R. VamshiTeja, Krishnakant Singh, V. Balasubramanian. 20 Jun 2019.
- A Generic Acceleration Framework for Stochastic Composite Optimization. A. Kulunchakov, Julien Mairal. 03 Jun 2019.
- Unified Acceleration of High-Order Algorithms under Hölder Continuity and Uniform Convexity. Chaobing Song, Yong Jiang, Yi Ma. 03 Jun 2019.
- On the computational complexity of the probabilistic label tree algorithms. R. Busa-Fekete, Krzysztof Dembczyński, Alexander Golovnev, Kalina Jasinska, Mikhail Kuznetsov, M. Sviridenko, Chao Xu. 01 Jun 2019.
- Convergence of Distributed Stochastic Variance Reduced Methods without Sampling Extra Data. Shicong Cen, Huishuai Zhang, Yuejie Chi, Wei-neng Chen, Tie-Yan Liu. 29 May 2019.
- A unified variance-reduced accelerated gradient method for convex optimization. Guanghui Lan, Zhize Li, Yi Zhou. 29 May 2019.
- Why gradient clipping accelerates training: A theoretical justification for adaptivity. J.N. Zhang, Tianxing He, S. Sra, Ali Jadbabaie. 28 May 2019.
- One Method to Rule Them All: Variance Reduction for Data, Parameters and Many New Methods. Filip Hanzely, Peter Richtárik. 27 May 2019.
- Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates. Sharan Vaswani, Aaron Mishkin, I. Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien. 24 May 2019.
- Hybrid Stochastic Gradient Descent Algorithms for Stochastic Nonconvex Optimization. Quoc Tran-Dinh, Nhan H. Pham, Dzung Phan, Lam M. Nguyen. 15 May 2019.
- Solving Empirical Risk Minimization in the Current Matrix Multiplication Time. Y. Lee, Zhao Song, Qiuyi Zhang. 11 May 2019.
- Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources. Yanghua Peng, Hang Zhang, Yifei Ma, Tong He, Zhi-Li Zhang, Sheng Zha, Mu Li. 26 Apr 2019.
- On Structured Filtering-Clustering: Global Error Bound and Optimal First-Order Algorithms. Nhat Ho, Tianyi Lin, Michael I. Jordan. 16 Apr 2019.
- On the Adaptivity of Stochastic Gradient-Based Optimization. Lihua Lei, Michael I. Jordan. 09 Apr 2019.
- Cocoercivity, Smoothness and Bias in Variance-Reduced Stochastic Gradient Methods. Martin Morin, Pontus Giselsson. 21 Mar 2019.
- Noisy Accelerated Power Method for Eigenproblems with Applications. Vien V. Mai, M. Johansson. 20 Mar 2019.
- ProxSARAH: An Efficient Algorithmic Framework for Stochastic Composite Nonconvex Optimization. Nhan H. Pham, Lam M. Nguyen, Dzung Phan, Quoc Tran-Dinh. 15 Feb 2019.
- Momentum Schemes with Stochastic Variance Reduction for Nonconvex Composite Optimization. Yi Zhou, Zhe Wang, Kaiyi Ji, Yingbin Liang, Vahid Tarokh. 07 Feb 2019.
- Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions. Adrien B. Taylor, Francis R. Bach. 03 Feb 2019.
- Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions. Yunwen Lei, Ting Hu, Guiying Li, K. Tang. 03 Feb 2019.
- Lower Bounds for Smooth Nonconvex Finite-Sum Optimization. Dongruo Zhou, Quanquan Gu. 31 Jan 2019.
- Asynchronous Accelerated Proximal Stochastic Gradient for Strongly Convex Distributed Finite Sums. Aymeric Dieuleveut, Francis R. Bach, Laurent Massoulié. 28 Jan 2019.
- Estimate Sequences for Stochastic Composite Optimization: Variance Reduction, Acceleration, and Robustness to Noise. A. Kulunchakov, Julien Mairal. 25 Jan 2019.
- Don't Jump Through Hoops and Remove Those Loops: SVRG and Katyusha are Better Without the Outer Loop. D. Kovalev, Samuel Horváth, Peter Richtárik. 24 Jan 2019.
- Stochastic Trust Region Inexact Newton Method for Large-scale Machine Learning. Vinod Kumar Chauhan, A. Sharma, Kalpana Dahiya. 26 Dec 2018.
- On the Ineffectiveness of Variance Reduced Optimization for Deep Learning. Aaron Defazio, Léon Bottou. 11 Dec 2018.
- Exploiting Numerical Sparsity for Efficient Learning: Faster Eigenvector Computation and Regression. Neha Gupta, Aaron Sidford. 27 Nov 2018.
- R-SPIDER: A Fast Riemannian Stochastic Optimization Algorithm with Curvature Independent Rate. J.N. Zhang, Hongyi Zhang, S. Sra. 10 Nov 2018.
- Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron. Sharan Vaswani, Francis R. Bach, Mark Schmidt. 16 Oct 2018.
- Quasi-hyperbolic momentum and Adam for deep learning. Jerry Ma, Denis Yarats. 16 Oct 2018.
- Continuous-time Models for Stochastic Optimization Algorithms. Antonio Orvieto, Aurelien Lucchi. 05 Oct 2018.
- Optimal Matrix Momentum Stochastic Approximation and Applications to Q-learning. Adithya M. Devraj, Ana Bušić, Sean P. Meyn. 17 Sep 2018.
- SEGA: Variance Reduction via Gradient Sketching. Filip Hanzely, Konstantin Mishchenko, Peter Richtárik. 09 Sep 2018.
- Online Adaptive Methods, Universality and Acceleration. Kfir Y. Levy, A. Yurtsever, Volkan Cevher. 08 Sep 2018.
- A Fast Anderson-Chebyshev Acceleration for Nonlinear Optimization. Zhize Li, Jian Li. 07 Sep 2018.
- Stochastically Controlled Stochastic Gradient for the Convex and Non-convex Composition problem. Liu Liu, Ji Liu, Cho-Jui Hsieh, Dacheng Tao. 06 Sep 2018.
- Direct Acceleration of SAGA using Sampled Negative Momentum. Kaiwen Zhou. 28 Jun 2018.
- A Simple Stochastic Variance Reduced Algorithm with Fast Convergence Rates. Kaiwen Zhou, Fanhua Shang, James Cheng. 28 Jun 2018.
- Stochastic Nested Variance Reduction for Nonconvex Optimization. Dongruo Zhou, Pan Xu, Quanquan Gu. 20 Jun 2018.
- Laplacian Smoothing Gradient Descent. Stanley Osher, Bao Wang, Penghang Yin, Xiyang Luo, Farzin Barekat, Minh Pham, A. Lin. 17 Jun 2018.
- Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors. Atsushi Nitanda, Taiji Suzuki. 14 Jun 2018.
- Double Quantization for Communication-Efficient Distributed Optimization. Yue Yu, Jiaxiang Wu, Longbo Huang. 25 May 2018.