
Optimization Methods for Large-Scale Machine Learning
Léon Bottou, Frank E. Curtis, J. Nocedal
15 June 2016 · arXiv:1606.04838

Papers citing "Optimization Methods for Large-Scale Machine Learning"
50 of 1,406 papers shown. Each entry: title; authors [community tags]; site metrics · publication date.
Training Feedforward Neural Networks with Standard Logistic Activations is Feasible
Emanuele Sansone, F. D. De Natale
24 · 4 · 0 · 03 Oct 2017

How regularization affects the critical points in linear networks
Amirhossein Taghvaei, Jin-Won Kim, P. Mehta
26 · 13 · 0 · 27 Sep 2017

On Principal Components Regression, Random Projections, and Column Subsampling
M. Slawski
9 · 20 · 0 · 23 Sep 2017

Feedforward and Recurrent Neural Networks Backward Propagation and Hessian in Matrix Form
Maxim Naumov
23 · 9 · 0 · 16 Sep 2017

ClickBAIT: Click-based Accelerated Incremental Training of Convolutional Neural Networks
Ervin Teng, João Diogo Falcão, Bob Iannucci
33 · 14 · 0 · 15 Sep 2017

The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems
V. Patel [MLT]
17 · 8 · 0 · 14 Sep 2017
Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study
Peng Xu, Farbod Roosta-Khorasani, Michael W. Mahoney [ODL]
14 · 143 · 0 · 25 Aug 2017

Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information
Peng Xu, Farbod Roosta-Khorasani, Michael W. Mahoney
28 · 210 · 0 · 23 Aug 2017

Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
L. Smith, Nicholay Topin [AI4CE]
20 · 519 · 0 · 23 Aug 2017

Regularizing and Optimizing LSTM Language Models
Stephen Merity, N. Keskar, R. Socher
60 · 1,091 · 0 · 07 Aug 2017

On the convergence properties of a $K$-step averaging stochastic gradient descent algorithm for nonconvex optimization
Fan Zhou, Guojing Cong
46 · 232 · 0 · 03 Aug 2017
A Robust Multi-Batch L-BFGS Method for Machine Learning
A. Berahas, Martin Takáč [AAML, ODL]
19 · 44 · 0 · 26 Jul 2017

Warped Riemannian metrics for location-scale models
Salem Said, Lionel Bombrun, Y. Berthoumieu
37 · 15 · 0 · 22 Jul 2017

Stochastic, Distributed and Federated Optimization for Machine Learning
Jakub Konecný [FedML]
26 · 38 · 0 · 04 Jul 2017

Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning
Frank E. Curtis, K. Scheinberg
39 · 45 · 0 · 30 Jun 2017

Efficiency of quantum versus classical annealing in non-convex learning problems
Carlo Baldassi, R. Zecchina
16 · 43 · 0 · 26 Jun 2017
Faster independent component analysis by preconditioning with Hessian approximations
Pierre Ablin, J. Cardoso, Alexandre Gramfort [CML]
28 · 124 · 0 · 25 Jun 2017

Collaborative Deep Learning in Fixed Topology Networks
Zhanhong Jiang, Aditya Balu, C. Hegde, S. Sarkar [FedML]
21 · 179 · 0 · 23 Jun 2017

Improved Optimization of Finite Sums with Minibatch Stochastic Variance Reduced Proximal Iterations
Jialei Wang, Tong Zhang
19 · 12 · 0 · 21 Jun 2017

Gradient Diversity: a Key Ingredient for Scalable Distributed Learning
Dong Yin, A. Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, Peter L. Bartlett
9 · 11 · 0 · 18 Jun 2017

Stochastic Training of Neural Networks via Successive Convex Approximations
Simone Scardapane, P. Di Lorenzo
22 · 9 · 0 · 15 Jun 2017
Proximal Backpropagation
Thomas Frerix, Thomas Möllenhoff, Michael Möller, Daniel Cremers
23 · 31 · 0 · 14 Jun 2017

Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
Priya Goyal, Piotr Dollár, Ross B. Girshick, P. Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He [3DH]
22 · 3,649 · 0 · 08 Jun 2017

Diagonal Rescaling For Neural Networks
Jean Lafond, Nicolas Vasilache, Léon Bottou
6 · 11 · 0 · 25 May 2017

Diminishing Batch Normalization
Yintai Ma, Diego Klabjan
31 · 15 · 0 · 22 May 2017

On the diffusion approximation of nonconvex stochastic gradient descent
Junyang Qian, C. J. Li, Lei Li, Jianguo Liu [DiffM]
23 · 24 · 0 · 22 May 2017
EE-Grad: Exploration and Exploitation for Cost-Efficient Mini-Batch SGD
Mehmet A. Donmez, Maxim Raginsky, A. Singer [FedML]
9 · 0 · 0 · 19 May 2017

An Investigation of Newton-Sketch and Subsampled Newton Methods
A. Berahas, Raghu Bollapragada, J. Nocedal
19 · 111 · 0 · 17 May 2017

Efficient Parallel Methods for Deep Reinforcement Learning
Alfredo V. Clemente, Humberto Nicolás Castejón Martínez, A. Chandra
9 · 114 · 0 · 13 May 2017

Stable Architectures for Deep Neural Networks
E. Haber, Lars Ruthotto
23 · 714 · 0 · 09 May 2017

SEAGLE: Sparsity-Driven Image Reconstruction under Multiple Scattering
Hsiou-Yuan Liu, Dehong Liu, Hassan Mansour, P. Boufounos, Laura Waller, Ulugbek S. Kamilov
9 · 75 · 0 · 05 May 2017
Bandit Structured Prediction for Neural Sequence-to-Sequence Learning
Julia Kreutzer, Artem Sokolov, Stefan Riezler
27 · 49 · 0 · 21 Apr 2017

Deep Relaxation: partial differential equations for optimizing deep neural networks
Pratik Chaudhari, Adam M. Oberman, Stanley Osher, Stefano Soatto, G. Carlier
27 · 153 · 0 · 17 Apr 2017

Inference via low-dimensional couplings
Alessio Spantini, Daniele Bigoni, Youssef Marzouk
38 · 119 · 0 · 17 Mar 2017

Sharp Minima Can Generalize For Deep Nets
Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio [ODL]
46 · 757 · 0 · 15 Mar 2017

Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis
Hiroyuki Kasai, Hiroyuki Sato, Bamdev Mishra
13 · 22 · 0 · 15 Mar 2017
Learning across scales - A multiscale method for Convolution Neural Networks
E. Haber, Lars Ruthotto, E. Holtham, Seong-Hwan Jun
17 · 23 · 0 · 06 Mar 2017

Stochastic Functional Gradient for Motion Planning in Continuous Occupancy Maps
Gilad Francis, Lionel Ott, F. Ramos
16 · 16 · 0 · 01 Mar 2017

SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
Lam M. Nguyen, Jie Liu, K. Scheinberg, Martin Takáč [ODL]
28 · 597 · 0 · 01 Mar 2017

Stochastic Newton and Quasi-Newton Methods for Large Linear Least-squares Problems
Julianne Chung, Matthias Chung, J. T. Slagel, L. Tenorio
27 · 11 · 0 · 23 Feb 2017

On SGD's Failure in Practice: Characterizing and Overcoming Stalling
V. Patel
16 · 1 · 0 · 01 Feb 2017

Stochastic Subsampling for Factorizing Huge Matrices
A. Mensch, Julien Mairal, B. Thirion, Gaël Varoquaux
9 · 30 · 0 · 19 Jan 2017
Towards Principled Methods for Training Generative Adversarial Networks
Martin Arjovsky, Léon Bottou [GAN]
27 · 2,096 · 0 · 17 Jan 2017
Stochastic Generative Hashing
Bo Dai, Ruiqi Guo, Sanjiv Kumar, Niao He, Le Song [TPM]
35 · 106 · 0 · 11 Jan 2017

Coupling Adaptive Batch Sizes with Learning Rates
Lukas Balles, Javier Romero, Philipp Hennig [ODL]
21 · 110 · 0 · 15 Dec 2016

Federated Optimization: Distributed Machine Learning for On-Device Intelligence
Jakub Konecný, H. B. McMahan, Daniel Ramage, Peter Richtárik [FedML]
60 · 1,878 · 0 · 08 Oct 2016

Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite-Sum Structure
A. Bietti, Julien Mairal
44 · 36 · 0 · 04 Oct 2016

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang [ODL]
308 · 2,890 · 0 · 15 Sep 2016

Benchmarking State-of-the-Art Deep Learning Software Tools
S. Shi, Qiang-qiang Wang, Pengfei Xu, Xiaowen Chu [BDL]
14 · 327 · 0 · 25 Aug 2016

DOOMED: Direct Online Optimization of Modeling Errors in Dynamics
Nathan D. Ratliff, Franziska Meier, Daniel Kappler, S. Schaal
17 · 17 · 0 · 01 Aug 2016