Optimization Methods for Large-Scale Machine Learning

15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 867 papers shown

Title
Finite-Sum Smooth Optimization with SARAH Lam M. Nguyen Marten van Dijk Dzung Phan Phuong Ha Nguyen Tsui-Wei Weng Jayant Kalagnanam 76 23 0 22 Jan 2019
AccUDNN: A GPU Memory Efficient Accelerator for Training Ultra-deep Neural Networks Jinrong Guo Wantao Liu Wang Wang Q. Lu Songlin Hu Jizhong Han Ruixuan Li 62 9 0 21 Jan 2019
Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization Sattar Vakili Sudeep Salgia Qing Zhao 49 7 0 17 Jan 2019
Block-Randomized Stochastic Proximal Gradient for Low-Rank Tensor Factorization Xiao Fu Shahana Ibrahim Hoi-To Wai Cheng Gao Kejun Huang 134 37 0 16 Jan 2019
Optimization Problems for Machine Learning: A Survey Claudio Gambella Bissan Ghaddar Joe Naoum-Sawaya AI4CE 142 181 0 16 Jan 2019
CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers A. Koliousis Pijika Watcharapichat Matthias Weidlich Kai Zou Paolo Costa Peter R. Pietzuch 65 70 0 08 Jan 2019
SGD Converges to Global Minimum in Deep Learning via Star-convex Path Yi Zhou Junjie Yang Huishuai Zhang Yingbin Liang Vahid Tarokh 79 74 0 02 Jan 2019
Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Rank-1 Robust Principal Component Analysis Salar Fattahi Somayeh Sojoudi 74 38 0 30 Dec 2018
On Lazy Training in Differentiable Programming Lénaïc Chizat Edouard Oyallon Francis R. Bach 111 840 0 19 Dec 2018
An Empirical Model of Large-Batch Training Sam McCandlish Jared Kaplan Dario Amodei OpenAI Dota Team 76 280 0 14 Dec 2018
Gradient Descent Happens in a Tiny Subspace Guy Gur-Ari Daniel A. Roberts Ethan Dyer 105 234 0 12 Dec 2018
Layer-Parallel Training of Deep Residual Neural Networks Stefanie Günther Lars Ruthotto J. Schroder E. Cyr N. Gauger 90 90 0 11 Dec 2018
Universal Adversarial Training A. Mendrik Mahyar Najibi Zheng Xu John P. Dickerson L. Davis Tom Goldstein AAML OOD 102 190 0 27 Nov 2018
Forward Stability of ResNet and Its Variants Linan Zhang Hayden Schaeffer 121 48 0 24 Nov 2018
Parallel sequential Monte Carlo for stochastic gradient-free nonconvex optimization Ömer Deniz Akyildiz Dan Crisan Joaquín Míguez 67 6 0 23 Nov 2018
A Sufficient Condition for Convergences of Adam and RMSProp Fangyu Zou Li Shen Zequn Jie Weizhong Zhang Wei Liu 81 373 0 23 Nov 2018
New Convergence Aspects of Stochastic Gradient Algorithms Lam M. Nguyen Phuong Ha Nguyen Peter Richtárik K. Scheinberg Martin Takáč Marten van Dijk 141 66 0 10 Nov 2018
A Bayesian Perspective of Statistical Machine Learning for Big Data R. Sambasivan Sourish Das S. Sahu BDL GP 61 20 0 09 Nov 2018
Double Adaptive Stochastic Gradient Optimization Rajaditya Mukherjee Jin Li Shicheng Chu Huamin Wang ODL 53 0 0 06 Nov 2018
Non-Asymptotic Guarantees For Sampling by Stochastic Gradient Descent Avetik G. Karagulyan 23 1 0 02 Nov 2018
A general system of differential equations to model first order adaptive algorithms André Belotto da Silva Maxime Gazeau 89 34 0 31 Oct 2018
SpiderBoost and Momentum: Faster Stochastic Variance Reduction Algorithms Zhe Wang Kaiyi Ji Yi Zhou Yingbin Liang Vahid Tarokh ODL 98 82 0 25 Oct 2018
Adaptive Communication Strategies to Achieve the Best Error-Runtime Trade-off in Local-Update SGD Jianyu Wang Gauri Joshi FedML 110 232 0 19 Oct 2018
First-order and second-order variants of the gradient descent in a unified framework Thomas Pierrot Nicolas Perrin Olivier Sigaud ODL 69 7 0 18 Oct 2018
Fault Tolerance in Iterative-Convergent Machine Learning Aurick Qiao Bryon Aragam Bingjing Zhang Eric Xing 76 42 0 17 Oct 2018
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks Xiaodong Cui Wei Zhang Zoltán Tüske M. Picheny ODL 90 91 0 16 Oct 2018
Approximate Fisher Information Matrix to Characterise the Training of Deep Neural Networks Zhibin Liao Tom Drummond Ian Reid G. Carneiro 80 23 0 16 Oct 2018
Deep Reinforcement Learning Yuxi Li VLM OffRL 194 144 0 15 Oct 2018
Tight Dimension Independent Lower Bound on the Expected Convergence Rate for Diminishing Step Sizes in SGD Phuong Ha Nguyen Lam M. Nguyen Marten van Dijk LRM 75 32 0 10 Oct 2018
Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD Marten van Dijk Lam M. Nguyen Phuong Ha Nguyen Dzung Phan 95 6 0 09 Oct 2018
Information Geometry of Orthogonal Initializations and Training Piotr A. Sokól Il-Su Park AI4CE 136 17 0 09 Oct 2018
Accelerating Stochastic Gradient Descent Using Antithetic Sampling Jingchang Liu Linli Xu 49 2 0 07 Oct 2018
Continuous-time Models for Stochastic Optimization Algorithms Antonio Orvieto Aurelien Lucchi 119 32 0 05 Oct 2018
Combining Natural Gradient with Hessian Free Methods for Sequence Training Adnan Haider P. Woodland ODL 48 4 0 03 Oct 2018
Large batch size training of neural networks with adversarial training and second-order information Z. Yao A. Gholami Daiyaan Arfeen Richard Liaw Joseph E. Gonzalez Kurt Keutzer Michael W. Mahoney ODL 96 42 0 02 Oct 2018
Mini-batch Serialization: CNN Training with Inter-layer Data Reuse Sangkug Lym Armand Behroozi W. Wen Ge Li Yongkee Kwon M. Erez 41 26 0 30 Sep 2018
A fast quasi-Newton-type method for large-scale stochastic optimisation A. Wills Carl Jidling Thomas B. Schon ODL 64 7 0 29 Sep 2018
Fluctuation-dissipation relations for stochastic gradient descent Sho Yaida 121 75 0 28 Sep 2018
Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview Yuejie Chi Yue M. Lu Yuxin Chen 75 427 0 25 Sep 2018
Predictive Collective Variable Discovery with Deep Bayesian Models M. Schöberl N. Zabaras P. Koutsourelakis 66 34 0 18 Sep 2018
MotherNets: Rapid Deep Ensemble Learning Abdul Wasay Brian Hentschel Yuze Liao Sanyuan Chen Stratos Idreos 58 35 0 12 Sep 2018
MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection Wenchi Ma Yuanwei Wu Zongbo Wang Guanghui Wang ObjD 73 25 0 06 Sep 2018
Compositional Stochastic Average Gradient for Machine Learning and Related Applications Tsung-Yu Hsieh Y. El-Manzalawy Yiwei Sun Vasant Honavar 44 1 0 04 Sep 2018
Distributed Nonconvex Constrained Optimization over Time-Varying Digraphs G. Scutari Ying Sun 100 176 0 04 Sep 2018
Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant D. Loroch Franz-Josef Pfreundt Norbert Wehn J. Keuper 46 5 0 27 Aug 2018
Cooperative SGD: A unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms Jianyu Wang Gauri Joshi 196 350 0 22 Aug 2018
Backtracking gradient descent method for general $C^1$ functions, with applications to Deep Learning T. Truong T. H. Nguyen 73 10 0 15 Aug 2018
On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization Xiangyi Chen Sijia Liu Ruoyu Sun Mingyi Hong 101 324 0 08 Aug 2018
Particle Filtering Methods for Stochastic Optimization with Application to Large-Scale Empirical Risk Minimization Bin Liu 66 11 0 23 Jul 2018
Training Neural Networks Using Features Replay Zhouyuan Huo Bin Gu Heng-Chiao Huang 94 70 0 12 Jul 2018