1606.04838
Cited By
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing "Optimization Methods for Large-Scale Machine Learning" (50 of 1,407 papers shown)
DTN: A Learning Rate Scheme with Convergence Rate of $\mathcal{O}(1/t)$ for SGD
Lam M. Nguyen
Phuong Ha Nguyen
Dzung Phan
Jayant Kalagnanam
Marten van Dijk
33
0
0
22 Jan 2019
AccUDNN: A GPU Memory Efficient Accelerator for Training Ultra-deep Neural Networks
Jinrong Guo
Wantao Liu
Wang Wang
Q. Lu
Songlin Hu
Jizhong Han
Ruixuan Li
16
9
0
21 Jan 2019
Tuning parameter selection rules for nuclear norm regularized multivariate linear regression
Pan Shang
Lingchen Kong
23
1
0
19 Jan 2019
Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization
Sattar Vakili
Sudeep Salgia
Qing Zhao
17
7
0
17 Jan 2019
Block-Randomized Stochastic Proximal Gradient for Low-Rank Tensor Factorization
Xiao Fu
Shahana Ibrahim
Hoi-To Wai
Cheng Gao
Kejun Huang
17
37
0
16 Jan 2019
Optimization Problems for Machine Learning: A Survey
Claudio Gambella
Bissan Ghaddar
Joe Naoum-Sawaya
AI4CE
30
178
0
16 Jan 2019
CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers
A. Koliousis
Pijika Watcharapichat
Matthias Weidlich
Luo Mai
Paolo Costa
Peter R. Pietzuch
13
69
0
08 Jan 2019
SGD Converges to Global Minimum in Deep Learning via Star-convex Path
Yi Zhou
Junjie Yang
Huishuai Zhang
Yingbin Liang
Vahid Tarokh
14
71
0
02 Jan 2019
Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Rank-1 Robust Principal Component Analysis
S. Fattahi
Somayeh Sojoudi
14
38
0
30 Dec 2018
On Lazy Training in Differentiable Programming
Lénaïc Chizat
Edouard Oyallon
Francis R. Bach
46
806
0
19 Dec 2018
A stochastic approximation method for approximating the efficient frontier of chance-constrained nonlinear programs
R. Kannan
James R. Luedtke
8
4
0
17 Dec 2018
An Empirical Model of Large-Batch Training
Sam McCandlish
Jared Kaplan
Dario Amodei
OpenAI Dota Team
13
268
0
14 Dec 2018
Gradient Descent Happens in a Tiny Subspace
Guy Gur-Ari
Daniel A. Roberts
Ethan Dyer
30
228
0
12 Dec 2018
Layer-Parallel Training of Deep Residual Neural Networks
Stefanie Günther
Lars Ruthotto
J. Schroder
E. Cyr
N. Gauger
17
90
0
11 Dec 2018
A probabilistic incremental proximal gradient method
Ömer Deniz Akyildiz
Émilie Chouzenoux
Victor Elvira
Joaquín Míguez
8
3
0
04 Dec 2018
Image-based model parameter optimization using Model-Assisted Generative Adversarial Networks
Saúl Alonso-Monsalve
L. Whitehead
GAN
14
30
0
30 Nov 2018
Universal Adversarial Training
A. Mendrik
Mahyar Najibi
Zheng Xu
John P. Dickerson
L. Davis
Tom Goldstein
AAML
OOD
16
189
0
27 Nov 2018
Forward Stability of ResNet and Its Variants
Linan Zhang
Hayden Schaeffer
30
47
0
24 Nov 2018
Parallel sequential Monte Carlo for stochastic gradient-free nonconvex optimization
Ömer Deniz Akyildiz
Dan Crisan
Joaquín Míguez
15
5
0
23 Nov 2018
A Sufficient Condition for Convergences of Adam and RMSProp
Fangyu Zou
Li Shen
Zequn Jie
Weizhong Zhang
Wei Liu
33
364
0
23 Nov 2018
Distributed Gradient Descent with Coded Partial Gradient Computations
Emre Ozfatura
S. Ulukus
Deniz Gunduz
19
40
0
22 Nov 2018
New Convergence Aspects of Stochastic Gradient Algorithms
Lam M. Nguyen
Phuong Ha Nguyen
Peter Richtárik
K. Scheinberg
Martin Takáč
Marten van Dijk
23
66
0
10 Nov 2018
A Bayesian Perspective of Statistical Machine Learning for Big Data
R. Sambasivan
Sourish Das
S. Sahu
BDL
GP
14
19
0
09 Nov 2018
Double Adaptive Stochastic Gradient Optimization
Rajaditya Mukherjee
Jin Li
Shicheng Chu
Huamin Wang
ODL
24
0
0
06 Nov 2018
Non-Asymptotic Guarantees For Sampling by Stochastic Gradient Descent
Avetik G. Karagulyan
11
1
0
02 Nov 2018
Functional Nonlinear Sparse Models
Luiz F. O. Chamon
Yonina C. Eldar
Alejandro Ribeiro
11
11
0
01 Nov 2018
A general system of differential equations to model first order adaptive algorithms
André Belotto da Silva
Maxime Gazeau
11
33
0
31 Oct 2018
Kalman Gradient Descent: Adaptive Variance Reduction in Stochastic Optimization
James Vuckovic
ODL
16
15
0
29 Oct 2018
SpiderBoost and Momentum: Faster Stochastic Variance Reduction Algorithms
Zhe Wang
Kaiyi Ji
Yi Zhou
Yingbin Liang
Vahid Tarokh
ODL
35
81
0
25 Oct 2018
Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods
R. Freund
Paul Grigas
Rahul Mazumder
20
10
0
20 Oct 2018
Adaptive Communication Strategies to Achieve the Best Error-Runtime Trade-off in Local-Update SGD
Jianyu Wang
Gauri Joshi
FedML
33
231
0
19 Oct 2018
First-order and second-order variants of the gradient descent in a unified framework
Thomas Pierrot
Nicolas Perrin
Olivier Sigaud
ODL
30
7
0
18 Oct 2018
Fault Tolerance in Iterative-Convergent Machine Learning
Aurick Qiao
Bryon Aragam
Bingjing Zhang
Eric Xing
26
41
0
17 Oct 2018
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks
Xiaodong Cui
Wei Zhang
Zoltán Tüske
M. Picheny
ODL
16
89
0
16 Oct 2018
Approximate Fisher Information Matrix to Characterise the Training of Deep Neural Networks
Zhibin Liao
Tom Drummond
Ian Reid
G. Carneiro
14
21
0
16 Oct 2018
Deep Reinforcement Learning
Yuxi Li
VLM
OffRL
28
144
0
15 Oct 2018
Tight Dimension Independent Lower Bound on the Expected Convergence Rate for Diminishing Step Sizes in SGD
Phuong Ha Nguyen
Lam M. Nguyen
Marten van Dijk
LRM
12
31
0
10 Oct 2018
Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
Marten van Dijk
Lam M. Nguyen
Phuong Ha Nguyen
Dzung Phan
36
6
0
09 Oct 2018
Information Geometry of Orthogonal Initializations and Training
Piotr A. Sokól
Il-Su Park
AI4CE
77
16
0
09 Oct 2018
Principled Deep Neural Network Training through Linear Programming
D. Bienstock
Gonzalo Muñoz
Sebastian Pokutta
35
24
0
07 Oct 2018
Accelerating Stochastic Gradient Descent Using Antithetic Sampling
Jingchang Liu
Linli Xu
19
2
0
07 Oct 2018
Continuous-time Models for Stochastic Optimization Algorithms
Antonio Orvieto
Aurelien Lucchi
19
31
0
05 Oct 2018
Combining Natural Gradient with Hessian Free Methods for Sequence Training
Adnan Haider
P. Woodland
ODL
20
4
0
03 Oct 2018
Large batch size training of neural networks with adversarial training and second-order information
Z. Yao
A. Gholami
Daiyaan Arfeen
Richard Liaw
Joseph E. Gonzalez
Kurt Keutzer
Michael W. Mahoney
ODL
6
42
0
02 Oct 2018
Privacy-preserving Stochastic Gradual Learning
Bo Han
Ivor W. Tsang
Xiaokui Xiao
Ling-Hao Chen
S. Fung
C. Yu
NoLa
8
8
0
30 Sep 2018
Mini-batch Serialization: CNN Training with Inter-layer Data Reuse
Sangkug Lym
Armand Behroozi
W. Wen
Ge Li
Yongkee Kwon
M. Erez
12
25
0
30 Sep 2018
A fast quasi-Newton-type method for large-scale stochastic optimisation
A. Wills
Carl Jidling
Thomas B. Schon
ODL
28
7
0
29 Sep 2018
A Quantitative Analysis of the Effect of Batch Normalization on Gradient Descent
Yongqiang Cai
Qianxiao Li
Zuowei Shen
14
3
0
29 Sep 2018
Fluctuation-dissipation relations for stochastic gradient descent
Sho Yaida
32
73
0
28 Sep 2018
Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview
Yuejie Chi
Yue M. Lu
Yuxin Chen
39
416
0
25 Sep 2018