Optimization Methods for Large-Scale Machine Learning

15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,406 papers shown

A Unified Batch Online Learning Framework for Click Prediction Rishabh K. Iyer Nimit Acharya Tanuja Bompada Denis Xavier Charles Eren Manavoglu 9 2 0 12 Sep 2018
MotherNets: Rapid Deep Ensemble Learning Abdul Wasay Brian Hentschel Yuze Liao Sanyuan Chen Stratos Idreos 8 35 0 12 Sep 2018
MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection Wenchi Ma Yuanwei Wu Zongbo Wang Guanghui Wang ObjD 24 25 0 06 Sep 2018
Compositional Stochastic Average Gradient for Machine Learning and Related Applications Tsung-Yu Hsieh Y. El-Manzalawy Yiwei Sun Vasant Honavar 15 1 0 04 Sep 2018
Distributed Nonconvex Constrained Optimization over Time-Varying Digraphs G. Scutari Ying Sun 39 171 0 04 Sep 2018
Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant D. Loroch Franz-Josef Pfreundt Norbert Wehn J. Keuper 15 5 0 27 Aug 2018
Deep Learning: Computational Aspects Nicholas G. Polson Vadim Sokolov PINN BDL AI4CE 8 14 0 26 Aug 2018
Cooperative SGD: A unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms Jianyu Wang Gauri Joshi 33 348 0 22 Aug 2018
Experiential Robot Learning with Accelerated Neuroevolution Ahmed Aly J. Dugan 13 1 0 16 Aug 2018
Backtracking gradient descent method for general $C^1$ functions, with applications to Deep Learning T. Truong T. H. Nguyen 14 9 0 15 Aug 2018
On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization Xiangyi Chen Sijia Liu Ruoyu Sun Mingyi Hong 14 318 0 08 Aug 2018
Stochastic Gradient Descent with Biased but Consistent Gradient Estimators Jie Chen Ronny Luss 19 45 0 31 Jul 2018
Particle Filtering Methods for Stochastic Optimization with Application to Large-Scale Empirical Risk Minimization Bin Liu 16 10 0 23 Jul 2018
Newton-ADMM: A Distributed GPU-Accelerated Optimizer for Multiclass Classification Problems Chih-Hao Fang Sudhir B. Kylasa Fred Roosta Michael W. Mahoney A. Grama ODL 19 10 0 18 Jul 2018
Training Neural Networks Using Features Replay Zhouyuan Huo Bin Gu Heng-Chiao Huang 17 69 0 12 Jul 2018
Geometric Generalization Based Zero-Shot Learning Dataset Infinite World: Simple Yet Powerful R. Chidambaram Michael C. Kampffmeyer W. Neiswanger Xiaodan Liang T. Lachmann Eric P. Xing 13 0 0 10 Jul 2018
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator Cong Fang C. J. Li Zhouchen Lin Tong Zhang 50 570 0 04 Jul 2018
Quasi-Monte Carlo Variational Inference Alexander K. Buchholz F. Wenzel Stephan Mandt BDL 25 58 0 04 Jul 2018
Trust-Region Algorithms for Training Responses: Machine Learning Methods Using Indefinite Hessian Approximations Jennifer B. Erway J. Griffin Roummel F. Marcia Riadh Omheni 8 24 0 01 Jul 2018
Algorithms for solving optimization problems arising from deep neural net models: smooth problems Vyacheslav Kungurtsev Tomás Pevný 18 6 0 30 Jun 2018
Random Shuffling Beats SGD after Finite Epochs Jeff Z. HaoChen S. Sra 6 98 0 26 Jun 2018
Pushing the boundaries of parallel Deep Learning -- A practical approach Paolo Viviani M. Drocco Marco Aldinucci OOD 20 0 0 25 Jun 2018
Como funciona o Deep Learning M. Ponti G. B. P. D. Costa 29 13 0 20 Jun 2018
Laplacian Smoothing Gradient Descent Stanley Osher Bao Wang Penghang Yin Xiyang Luo Farzin Barekat Minh Pham A. Lin ODL 22 43 0 17 Jun 2018
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors Atsushi Nitanda Taiji Suzuki 8 10 0 14 Jun 2018
Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam Mohammad Emtiyaz Khan Didrik Nielsen Voot Tangkaratt Wu Lin Y. Gal Akash Srivastava ODL 74 268 0 13 Jun 2018
When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models? Tengyu Xu Yi Zhou Kaiyi Ji Yingbin Liang 29 19 0 12 Jun 2018
Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis Thomas George César Laurent Xavier Bouthillier Nicolas Ballas Pascal Vincent ODL 29 150 0 11 Jun 2018
Dissipativity Theory for Accelerating Stochastic Variance Reduction: A Unified Analysis of SVRG and Katyusha Using Semidefinite Programs Bin Hu S. Wright Laurent Lessard 11 20 0 10 Jun 2018
Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data Shuai Zheng James T. Kwok 6 9 0 08 Jun 2018
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation Jalaj Bhandari Daniel Russo Raghav Singal 18 334 0 06 Jun 2018
AdaGrad stepsizes: Sharp convergence over nonconvex landscapes Rachel A. Ward Xiaoxia Wu Léon Bottou ODL 27 359 0 05 Jun 2018
Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate Mor Shpigel Nacson Nathan Srebro Daniel Soudry FedML MLT 32 97 0 05 Jun 2018
Backdrop: Stochastic Backpropagation Siavash Golkar Kyle Cranmer 41 2 0 04 Jun 2018
Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients Sai Praneeth Karimireddy Sebastian U. Stich Martin Jaggi 21 50 0 01 Jun 2018
Accelerating Incremental Gradient Optimization with Curvature Information Hoi-To Wai Wei Shi César A. Uribe A. Nedić Anna Scaglione 6 12 0 31 May 2018
DeepMiner: Discovering Interpretable Representations for Mammogram Classification and Explanation Jimmy Wu Bolei Zhou D. Peck S. Hsieh V. Dialani Lester W. Mackey Genevieve Patterson FAtt MedIm 20 24 0 31 May 2018
On Consensus-Optimality Trade-offs in Collaborative Deep Learning Zhanhong Jiang Aditya Balu C. Hegde S. Sarkar FedML 24 7 0 30 May 2018
Bayesian Learning with Wasserstein Barycenters Julio D. Backhoff Veraguas J. Fontbona Gonzalo Rios Felipe A. Tobar 20 29 0 28 May 2018
Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes Loucas Pillaud-Vivien Alessandro Rudi Francis R. Bach 6 99 0 25 May 2018
Stochastic algorithms with descent guarantees for ICA Pierre Ablin Alexandre Gramfort J. Cardoso Francis R. Bach CML 9 7 0 25 May 2018
LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning Tianyi Chen G. Giannakis Tao Sun W. Yin 31 297 0 25 May 2018
A Two-Stage Subspace Trust Region Approach for Deep Neural Network Training V. Dudar Giovanni Chierchia Émilie Chouzenoux J. Pesquet V. Semenov 14 5 0 23 May 2018
Predictive Local Smoothness for Stochastic Gradient Methods Jun Yu Li Hongfu Liu Bineng Zhong Yue Wu Y. Fu ODL 6 1 0 23 May 2018
Efficient Stochastic Gradient Descent for Learning with Distributionally Robust Optimization Soumyadip Ghosh M. Squillante Ebisa D. Wollega OOD 11 10 0 22 May 2018
LMKL-Net: A Fast Localized Multiple Kernel Learning Solver via Deep Neural Networks Ziming Zhang ODL 9 1 0 22 May 2018
Stochastic modified equations for the asynchronous stochastic gradient descent Jing An Jian-wei Lu Lexing Ying 21 79 0 21 May 2018
On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes Xiaoyun Li Francesco Orabona 40 290 0 21 May 2018
Parallel and Distributed Successive Convex Approximation Methods for Big-Data Optimization G. Scutari Ying Sun 35 61 0 17 May 2018
Decoupled Parallel Backpropagation with Convergence Guarantee Zhouyuan Huo Bin Gu Qian Yang Heng-Chiao Huang 15 97 0 27 Apr 2018