arXiv:1802.05374
A Progressive Batching L-BFGS Method for Machine Learning
15 February 2018
Raghu Bollapragada, Dheevatsa Mudigere, J. Nocedal, Hao-Jun Michael Shi, P. T. P. Tang
Papers citing "A Progressive Batching L-BFGS Method for Machine Learning" (22 papers)
SAPPHIRE: Preconditioned Stochastic Variance Reduction for Faster Large-Scale Statistical Learning
Jingruo Sun, Zachary Frangella, Madeleine Udell (28 Jan 2025)

Second-order Information Promotes Mini-Batch Robustness in Variance-Reduced Gradients
Sachin Garg, A. Berahas, Michal Dereziński (23 Apr 2024)

Faster Convergence for Transformer Fine-tuning with Line Search Methods
Philip Kenneweg, Leonardo Galli, Tristan Kenneweg, Barbara Hammer (27 Mar 2024)

On the efficiency of Stochastic Quasi-Newton Methods for Deep Learning
M. Yousefi, Angeles Martinez (18 May 2022)

Adaptive Sampling Quasi-Newton Methods for Zeroth-Order Stochastic Optimization
Raghu Bollapragada, Stefan M. Wild (24 Sep 2021)

L-DQN: An Asynchronous Limited-Memory Distributed Quasi-Newton Method
Bugra Can, Saeed Soori, M. Dehnavi, Mert Gurbuzbalaban (20 Aug 2021)

Human Pose and Shape Estimation from Single Polarization Images
Shihao Zou, Wei Ji, Sen Wang, Yiming Qian, Chuan Guo, Li Cheng (15 Aug 2021)

A Geometric Analysis of Neural Collapse with Unconstrained Features
Zhihui Zhu, Tianyu Ding, Jinxin Zhou, Xiao Li, Chong You, Jeremias Sulam, Qing Qu (06 May 2021)

libEnsemble: A Library to Coordinate the Concurrent Evaluation of Dynamic Ensembles of Calculations
S. Hudson, Jeffrey Larson, John-Luke Navarro, Stefan M. Wild (16 Apr 2021)

An Adaptive Memory Multi-Batch L-BFGS Algorithm for Neural Network Training
Federico Zocco, Seán F. McLoone (14 Dec 2020)

Learning the Step-size Policy for the Limited-Memory Broyden-Fletcher-Goldfarb-Shanno Algorithm
Lucas N. Egidio, A. Hansson, B. Wahlberg (03 Oct 2020)

Whitening and second order optimization both make information in the dataset unusable during training, and can reduce or prevent generalization
Neha S. Wadia, Daniel Duckworth, S. Schoenholz, Ethan Dyer, Jascha Narain Sohl-Dickstein (17 Aug 2020)

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Z. Yao, A. Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney (01 Jun 2020)

Stochastic Calibration of Radio Interferometers
S. Yatawatta (02 Mar 2020)

CSM-NN: Current Source Model Based Logic Circuit Simulation -- A Neural Network Approach
M. Abrishami, Massoud Pedram, Shahin Nazarian (13 Feb 2020)

Stochastic quasi-Newton with line-search regularization
A. Wills, Thomas B. Schon (03 Sep 2019)

Adaptive Deep Learning for High-Dimensional Hamilton-Jacobi-Bellman Equations
Tenavi Nakamura-Zimmerer, Q. Gong, W. Kang (11 Jul 2019)

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
Sharan Vaswani, Aaron Mishkin, I. Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien (24 May 2019)

Active Probabilistic Inference on Matrices for Pre-Conditioning in Stochastic Optimization
Filip de Roos, Philipp Hennig (20 Feb 2019)

Large batch size training of neural networks with adversarial training and second-order information
Z. Yao, A. Gholami, Daiyaan Arfeen, Richard Liaw, Joseph E. Gonzalez, Kurt Keutzer, Michael W. Mahoney (02 Oct 2018)

A fast quasi-Newton-type method for large-scale stochastic optimisation
A. Wills, Carl Jidling, Thomas B. Schon (29 Sep 2018)

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang (15 Sep 2016)