Optimization Methods for Large-Scale Machine Learning

15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 866 papers shown

Title, authors, topic tags, counts, date
Quasi-Monte Carlo Variational Inference Alexander K. Buchholz F. Wenzel Stephan Mandt BDL 105 60 0 04 Jul 2018
Trust-Region Algorithms for Training Responses: Machine Learning Methods Using Indefinite Hessian Approximations Jennifer B. Erway J. Griffin Roummel F. Marcia Riadh Omheni 63 24 0 01 Jul 2018
Algorithms for solving optimization problems arising from deep neural net models: smooth problems Vyacheslav Kungurtsev Tomás Pevný 48 6 0 30 Jun 2018
Random Shuffling Beats SGD after Finite Epochs Jeff Z. HaoChen S. Sra 98 99 0 26 Jun 2018
Laplacian Smoothing Gradient Descent Stanley Osher Bao Wang Penghang Yin Xiyang Luo Farzin Barekat Minh Pham A. Lin ODL 113 43 0 17 Jun 2018
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors Atsushi Nitanda Taiji Suzuki 77 10 0 14 Jun 2018
Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam Mohammad Emtiyaz Khan Didrik Nielsen Voot Tangkaratt Wu Lin Y. Gal Akash Srivastava ODL 200 271 0 13 Jun 2018
When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models? Tengyu Xu Yi Zhou Kaiyi Ji Yingbin Liang 90 19 0 12 Jun 2018
Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis Thomas George César Laurent Xavier Bouthillier Nicolas Ballas Pascal Vincent ODL 115 156 0 11 Jun 2018
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation Jalaj Bhandari Daniel Russo Raghav Singal 115 340 0 06 Jun 2018
AdaGrad stepsizes: Sharp convergence over nonconvex landscapes Rachel A. Ward Xiaoxia Wu Léon Bottou ODL 115 369 0 05 Jun 2018
Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate Mor Shpigel Nacson Nathan Srebro Daniel Soudry FedML MLT 102 102 0 05 Jun 2018
Backdrop: Stochastic Backpropagation Siavash Golkar Kyle Cranmer 52 2 0 04 Jun 2018
Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients Sai Praneeth Karimireddy Sebastian U. Stich Martin Jaggi 86 52 0 01 Jun 2018
Accelerating Incremental Gradient Optimization with Curvature Information Hoi-To Wai Wei Shi César A. Uribe A. Nedić Anna Scaglione 40 12 0 31 May 2018
DeepMiner: Discovering Interpretable Representations for Mammogram Classification and Explanation Jimmy Wu Bolei Zhou D. Peck S. Hsieh V. Dialani Lester W. Mackey Genevieve Patterson FAtt MedIm 71 24 0 31 May 2018
Bayesian Learning with Wasserstein Barycenters Julio D. Backhoff Veraguas J. Fontbona Gonzalo Rios Felipe A. Tobar 64 31 0 28 May 2018
Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes Loucas Pillaud-Vivien Alessandro Rudi Francis R. Bach 179 103 0 25 May 2018
Stochastic algorithms with descent guarantees for ICA Pierre Ablin Alexandre Gramfort J. Cardoso Francis R. Bach CML 32 7 0 25 May 2018
LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning Tianyi Chen G. Giannakis Tao Sun W. Yin 60 299 0 25 May 2018
LMKL-Net: A Fast Localized Multiple Kernel Learning Solver via Deep Neural Networks Ziming Zhang ODL 26 1 0 22 May 2018
Stochastic modified equations for the asynchronous stochastic gradient descent Jing An Jian-wei Lu Lexing Ying 77 79 0 21 May 2018
On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes Xiaoyun Li Francesco Orabona 89 299 0 21 May 2018
Parallel and Distributed Successive Convex Approximation Methods for Big-Data Optimization G. Scutari Ying Sun 105 64 0 17 May 2018
Decoupled Parallel Backpropagation with Convergence Guarantee Zhouyuan Huo Bin Gu Qian Yang Heng-Chiao Huang 98 97 0 27 Apr 2018
Revisiting Small Batch Training for Deep Neural Networks Dominic Masters Carlo Luschi ODL 83 671 0 20 Apr 2018
Constant Step Size Stochastic Gradient Descent for Probabilistic Modeling Dmitry Babichev Francis R. Bach 62 9 0 16 Apr 2018
Sequence Training of DNN Acoustic Models With Natural Gradient Adnan Haider P. Woodland 41 7 0 06 Apr 2018
A Constant Step Stochastic Douglas-Rachford Algorithm with Application to Non Separable Regularizations Adil Salim Pascal Bianchi W. Hachem 72 2 0 03 Apr 2018
Training Tips for the Transformer Model Martin Popel Ondrej Bojar 110 312 0 01 Apr 2018
Lower error bounds for the stochastic gradient descent optimization algorithm: Sharp convergence rates for slowly and fast decaying learning rates Arnulf Jentzen Philippe von Wurstemberger 101 31 0 22 Mar 2018
Group Normalization Yuxin Wu Kaiming He 261 3,686 0 22 Mar 2018
Efficient FPGA Implementation of Conjugate Gradient Methods for Laplacian System using HLS Sahithi Rampalli N. Sehgal Ishita Bindlish Tanya Tyagi Pawan Kumar 33 4 0 10 Mar 2018
A Stochastic Semismooth Newton Method for Nonsmooth Nonconvex Optimization Andre Milzarek X. Xiao Shicong Cen Zaiwen Wen M. Ulbrich 66 36 0 09 Mar 2018
WNGrad: Learn the Learning Rate in Gradient Descent Xiaoxia Wu Rachel A. Ward Léon Bottou 70 87 0 07 Mar 2018
DAGs with NO TEARS: Continuous Optimization for Structure Learning Xun Zheng Bryon Aragam Pradeep Ravikumar Eric Xing NoLa CML OffRL 113 953 0 04 Mar 2018
Slow and Stale Gradients Can Win the Race: Error-Runtime Trade-offs in Distributed SGD Sanghamitra Dutta Gauri Joshi Soumyadip Ghosh Parijat Dube P. Nagpurkar 82 198 0 03 Mar 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis Tal Ben-Nun Torsten Hoefler GNN 87 713 0 26 Feb 2018
GPU Accelerated Sub-Sampled Newton's Method Sudhir B. Kylasa Farbod Roosta-Khorasani Michael W. Mahoney A. Grama ODL 79 8 0 26 Feb 2018
Complex-valued Neural Networks with Non-parametric Activation Functions Simone Scardapane S. Van Vaerenbergh Amir Hussain A. Uncini 81 84 0 22 Feb 2018
Spurious Valleys in Two-layer Neural Network Optimization Landscapes Luca Venturi Afonso S. Bandeira Joan Bruna 97 75 0 18 Feb 2018
Convergence of Online Mirror Descent Yunwen Lei Ding-Xuan Zhou 60 21 0 18 Feb 2018
Stochastic quasi-Newton with adaptive step lengths for large-scale problems A. Wills Thomas B. Schon 63 9 0 12 Feb 2018
SGD and Hogwild! Convergence Without the Bounded Gradients Assumption Lam M. Nguyen Phuong Ha Nguyen Marten van Dijk Peter Richtárik K. Scheinberg Martin Takáč 113 228 0 11 Feb 2018
Estimating Heterogeneous Consumer Preferences for Restaurants and Travel Time Using Mobile Location Data Susan Athey David M. Blei Rob Donnelly Francisco J. R. Ruiz Tobias Schmidt 43 66 0 22 Jan 2018
When Does Stochastic Gradient Algorithm Work Well? Lam M. Nguyen Nam H. Nguyen Dzung Phan Jayant Kalagnanam K. Scheinberg 86 15 0 18 Jan 2018
MXNET-MPI: Embedding MPI parallelism in Parameter Server Task Model for scaling Deep Learning Amith R. Mamidala Georgios Kollias C. Ward F. Artico 78 20 0 11 Jan 2018
Gradient-based Optimization for Regression in the Functional Tensor-Train Format Alex A. Gorodetsky J. Jakeman 76 34 0 03 Jan 2018
A Stochastic Trust Region Algorithm Based on Careful Step Normalization Frank E. Curtis K. Scheinberg R. Shi 75 45 0 29 Dec 2017
Geometrical Insights for Implicit Generative Modeling Léon Bottou Martín Arjovsky David Lopez-Paz Maxime Oquab 75 50 0 21 Dec 2017