Optimization Methods for Large-Scale Machine Learning

15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 866 papers shown

Title
Snake: a Stochastic Proximal Gradient Algorithm for Regularized Problems over Large Graphs Adil Salim Pascal Bianchi W. Hachem 65 17 0 19 Dec 2017
The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning Siyuan Ma Raef Bassily M. Belkin 117 291 0 18 Dec 2017
Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks Shankar Krishnan Ying Xiao Rif A. Saurous ODL 45 20 0 08 Dec 2017
AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks Aditya Devarakonda Maxim Naumov M. Garland ODL 112 136 0 06 Dec 2017
A two-dimensional decomposition approach for matrix completion through gossip Mukul Bhutani Bamdev Mishra 26 0 0 21 Nov 2017
Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks Ziming Zhang M. Brand 59 71 0 20 Nov 2017
BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning Ziming Zhang Yuanwei Wu Guanghui Wang ODL 65 28 0 19 Nov 2017
Accelerated Method for Stochastic Composition Optimization with Nonsmooth Regularization Zhouyuan Huo Bin Gu Ji Liu Heng-Chiao Huang 93 51 0 10 Nov 2017
SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements Francisco J. R. Ruiz Susan Athey David M. Blei 415 85 0 09 Nov 2017
Analysis of Biased Stochastic Gradient Descent Using Sequential Semidefinite Programs Bin Hu Peter M. Seiler Laurent Lessard 121 40 0 03 Nov 2017
Don't Decay the Learning Rate, Increase the Batch Size Samuel L. Smith Pieter-Jan Kindermans Chris Ying Quoc V. Le ODL 133 996 0 01 Nov 2017
Adaptive Sampling Strategies for Stochastic Optimization Raghu Bollapragada R. Byrd J. Nocedal 54 116 0 30 Oct 2017
On the role of synaptic stochasticity in training low-precision neural networks Carlo Baldassi Federica Gerace H. Kappen Carlo Lucibello Luca Saglietti Enzo Tartaglione R. Zecchina 55 23 0 26 Oct 2017
Avoiding Communication in Proximal Methods for Convex Optimization Problems Saeed Soori Aditya Devarakonda J. Demmel Mert Gurbuzbalaban M. Dehnavi 34 7 0 24 Oct 2017
Smart "Predict, then Optimize" Adam N. Elmachtoub Paul Grigas 104 613 0 22 Oct 2017
AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition Chun Yang Xu-Cheng Yin Zejun Li Jianwei Wu Chunchao Guo Hongfa Wang Lei Xiao 44 10 0 10 Oct 2017
Training Feedforward Neural Networks with Standard Logistic Activations is Feasible Emanuele Sansone F. D. De Natale 29 4 0 03 Oct 2017
How regularization affects the critical points in linear networks Amirhossein Taghvaei Jin-Won Kim P. Mehta 77 13 0 27 Sep 2017
Feedforward and Recurrent Neural Networks Backward Propagation and Hessian in Matrix Form Maxim Naumov 82 9 0 16 Sep 2017
ClickBAIT: Click-based Accelerated Incremental Training of Convolutional Neural Networks Ervin Teng João Diogo Falcão Bob Iannucci 62 14 0 15 Sep 2017
The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems V. Patel MLT 73 8 0 14 Sep 2017
Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study Peng Xu Farbod Roosta-Khorasani Michael W. Mahoney ODL 84 145 0 25 Aug 2017
Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information Peng Xu Farbod Roosta-Khorasani Michael W. Mahoney 133 214 0 23 Aug 2017
Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates L. Smith Nicholay Topin AI4CE 137 518 0 23 Aug 2017
Regularizing and Optimizing LSTM Language Models Stephen Merity N. Keskar R. Socher 178 1,098 0 07 Aug 2017
On the convergence properties of a $K$-step averaging stochastic gradient descent algorithm for nonconvex optimization Fan Zhou Guojing Cong 186 236 0 03 Aug 2017
A Robust Multi-Batch L-BFGS Method for Machine Learning A. Berahas Martin Takáč AAML ODL 111 44 0 26 Jul 2017
Warped Riemannian metrics for location-scale models Salem Said Lionel Bombrun Y. Berthoumieu 76 15 0 22 Jul 2017
Stochastic, Distributed and Federated Optimization for Machine Learning Jakub Konečný FedML 83 38 0 04 Jul 2017
Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning Frank E. Curtis K. Scheinberg 100 45 0 30 Jun 2017
Efficiency of quantum versus classical annealing in non-convex learning problems Carlo Baldassi R. Zecchina 78 45 0 26 Jun 2017
Faster independent component analysis by preconditioning with Hessian approximations Pierre Ablin J. Cardoso Alexandre Gramfort CML 87 127 0 25 Jun 2017
Collaborative Deep Learning in Fixed Topology Networks Zhanhong Jiang Aditya Balu Chinmay Hegde Soumik Sarkar FedML 82 181 0 23 Jun 2017
Improved Optimization of Finite Sums with Minibatch Stochastic Variance Reduced Proximal Iterations Jialei Wang Tong Zhang 80 12 0 21 Jun 2017
Gradient Diversity: a Key Ingredient for Scalable Distributed Learning Dong Yin A. Pananjady Max Lam Dimitris Papailiopoulos Kannan Ramchandran Peter L. Bartlett 89 11 0 18 Jun 2017
Stochastic Training of Neural Networks via Successive Convex Approximations Simone Scardapane Paolo Di Lorenzo 43 9 0 15 Jun 2017
Proximal Backpropagation Thomas Frerix Thomas Möllenhoff Michael Möller Zorah Lähner 66 31 0 14 Jun 2017
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour Priya Goyal Piotr Dollár Ross B. Girshick P. Noordhuis Lukasz Wesolowski Aapo Kyrola Andrew Tulloch Yangqing Jia Kaiming He 3DH 226 3,692 0 08 Jun 2017
Diminishing Batch Normalization Yintai Ma Diego Klabjan 49 15 0 22 May 2017
EE-Grad: Exploration and Exploitation for Cost-Efficient Mini-Batch SGD Mehmet A. Donmez Maxim Raginsky A. Singer FedML 16 0 0 19 May 2017
An Investigation of Newton-Sketch and Subsampled Newton Methods A. Berahas Raghu Bollapragada J. Nocedal 104 114 0 17 May 2017
Efficient Parallel Methods for Deep Reinforcement Learning Alfredo V. Clemente Humberto Nicolás Castejón Martínez A. Chandra 85 115 0 13 May 2017
Stable Architectures for Deep Neural Networks E. Haber Lars Ruthotto 174 736 0 09 May 2017
SEAGLE: Sparsity-Driven Image Reconstruction under Multiple Scattering Hsiou-Yuan Liu Dehong Liu Hassan Mansour P. Boufounos Laura Waller Ulugbek S. Kamilov 50 77 0 05 May 2017
Bandit Structured Prediction for Neural Sequence-to-Sequence Learning Julia Kreutzer Artem Sokolov Stefan Riezler 85 49 0 21 Apr 2017
Deep Relaxation: partial differential equations for optimizing deep neural networks Pratik Chaudhari Adam M. Oberman Stanley Osher Stefano Soatto G. Carlier 174 154 0 17 Apr 2017
Inference via low-dimensional couplings Alessio Spantini Daniele Bigoni Youssef Marzouk 145 119 0 17 Mar 2017
Sharp Minima Can Generalize For Deep Nets Laurent Dinh Razvan Pascanu Samy Bengio Yoshua Bengio ODL 147 774 0 15 Mar 2017
Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis Hiroyuki Kasai Hiroyuki Sato Bamdev Mishra 65 22 0 15 Mar 2017
SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient Lam M. Nguyen Jie Liu K. Scheinberg Martin Takáč ODL 177 608 0 01 Mar 2017