Optimization Methods for Large-Scale Machine Learning
Léon Bottou, Frank E. Curtis, J. Nocedal
arXiv:1606.04838 (v3, latest), 15 June 2016
Papers citing "Optimization Methods for Large-Scale Machine Learning" (showing 50 of 867):
- A Modified AUC for Training Convolutional Neural Networks: Taking Confidence into Account. Khashayar Namdar, M. Haider, Farzad Khalvati (08 Jun 2020)
- The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization. Wei Tao, Zhisong Pan, Gao-wei Wu, Qing Tao (08 Jun 2020)
- Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis. Courtney Paquette, B. V. Merrienboer, Elliot Paquette, Fabian Pedregosa (08 Jun 2020)
- SONIA: A Symmetric Blockwise Truncated Optimization Algorithm. Majid Jahani, M. Nazari, R. Tappenden, A. Berahas, Martin Takáč (06 Jun 2020) [ODL]
- UFO-BLO: Unbiased First-Order Bilevel Optimization. Valerii Likhosherstov, Xingyou Song, K. Choromanski, Jared Davis, Adrian Weller (05 Jun 2020)
- Scalable Plug-and-Play ADMM with Convergence Guarantees. Yu Sun, Zihui Wu, Xiaojian Xu, B. Wohlberg, Ulugbek S. Kamilov (05 Jun 2020) [BDL]
- Asymptotic Analysis of Conditioned Stochastic Gradient Descent. Rémi Leluc, François Portier (04 Jun 2020)
- A mathematical model for automatic differentiation in machine learning. Jérôme Bolte, Edouard Pauwels (03 Jun 2020)
- Finite Difference Neural Networks: Fast Prediction of Partial Differential Equations. Zheng Shi, Nur Sila Gulgec, A. Berahas, S. Pakzad, Martin Takáč (02 Jun 2020)
- Carathéodory Sampling for Stochastic Gradient Descent. Francesco Cosentino, Harald Oberhauser, Alessandro Abate (02 Jun 2020)
- Artificial neural networks for neuroscientists: A primer. G. R. Yang, Xiao-Jing Wang (01 Jun 2020)
- Data-Driven Methods to Monitor, Model, Forecast and Control Covid-19 Pandemic: Leveraging Data Science, Epidemiology and Control Theory. Teodoro Alamo, Daniel Gutiérrez-Reina, P. Millán (01 Jun 2020)
- ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning. Z. Yao, A. Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney (01 Jun 2020) [ODL]
- A New Accelerated Stochastic Gradient Method with Momentum. Liang Liu, Xiaopeng Luo (31 May 2020) [ODL]
- Complex Sequential Understanding through the Awareness of Spatial and Temporal Concepts. Bo Pang, Kaiwen Zha, Hanwen Cao, Jiajun Tang, Minghui Yu, Cewu Lu (30 May 2020)
- CoolMomentum: A Method for Stochastic Optimization by Langevin Dynamics with Simulated Annealing. O. Borysenko, M. Byshkin (29 May 2020) [ODL]
- HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism. Jay H. Park, Gyeongchan Yun, Chang Yi, N. T. Nguyen, Seungmin Lee, Jaesik Choi, S. Noh, Young-ri Choi (28 May 2020) [MoE]
- Convergence Analysis of Riemannian Stochastic Approximation Schemes. Alain Durmus, P. Jiménez, Eric Moulines, Salem Said, Hoi-To Wai (27 May 2020)
- Scalable Privacy-Preserving Distributed Learning. D. Froelicher, J. Troncoso-Pastoriza, Apostolos Pyrgelis, Sinem Sav, João Sá Sousa, Jean-Philippe Bossuat, Jean-Pierre Hubaux (19 May 2020) [FedML]
- PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking. Chong Xiang, A. Bhagoji, Vikash Sehwag, Prateek Mittal (17 May 2020) [AAML]
- S-ADDOPT: Decentralized stochastic first-order optimization over directed graphs. Muhammad I. Qureshi, Ran Xin, S. Kar, U. Khan (15 May 2020)
- Interpreting Rate-Distortion of Variational Autoencoder and Using Model Uncertainty for Anomaly Detection. Seonho Park, George Adosoglou, P. Pardalos (05 May 2020) [DRL, UQCV]
- Dynamic backup workers for parallel machine learning. Chuan Xu, Giovanni Neglia, Nicola Sebastianelli (30 Apr 2020)
- The Impact of the Mini-batch Size on the Variance of Gradients in Stochastic Gradient Descent. Xin-Yao Qian, Diego Klabjan (27 Apr 2020) [ODL]
- Heterogeneous CPU+GPU Stochastic Gradient Descent Algorithms. Yujing Ma, Florin Rusu (19 Apr 2020)
- On Learning Rates and Schrödinger Operators. Bin Shi, Weijie J. Su, Michael I. Jordan (15 Apr 2020)
- Stochastic batch size for adaptive regularization in deep network optimization. Kensuke Nakamura, Stefano Soatto, Byung-Woo Hong (14 Apr 2020) [ODL]
- Straggler-aware Distributed Learning: Communication Computation Latency Trade-off. Emre Ozfatura, S. Ulukus, Deniz Gunduz (10 Apr 2020)
- On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration. Wenlong Mou, C. J. Li, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan (09 Apr 2020)
- Deep Neural Network Learning with Second-Order Optimizers -- a Practical Study with a Stochastic Quasi-Gauss-Newton Method. C. Thiele, Mauricio Araya-Polo, D. Hohl (06 Apr 2020) [ODL]
- Stopping Criteria for, and Strong Convergence of, Stochastic Gradient Descent on Bottou-Curtis-Nocedal Functions. V. Patel (01 Apr 2020)
- Concentrated Differentially Private and Utility Preserving Federated Learning. Rui Hu, Yuanxiong Guo, Yanmin Gong (30 Mar 2020) [FedML]
- Differentially Private Federated Learning for Resource-Constrained Internet of Things. Rui Hu, Yuanxiong Guo, E. Ratazzi, Yanmin Gong (28 Mar 2020) [FedML]
- A Hybrid-Order Distributed SGD Method for Non-Convex Optimization to Balance Communication Overhead, Computational Complexity, and Convergence Rate. Naeimeh Omidvar, M. Maddah-ali, Hamed Mahdavi (27 Mar 2020) [ODL]
- Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence. Abhishek Gupta, W. Haskell (25 Mar 2020)
- Finite-Time Analysis of Stochastic Gradient Descent under Markov Randomness. Thinh T. Doan, Lam M. Nguyen, Nhan H. Pham, Justin Romberg (24 Mar 2020)
- A Unified Theory of Decentralized SGD with Changing Topology and Local Updates. Anastasia Koloskova, Nicolas Loizou, Sadra Boreiri, Martin Jaggi, Sebastian U. Stich (23 Mar 2020) [FedML]
- Block Layer Decomposition schemes for training Deep Neural Networks. L. Palagi, R. Seccia (18 Mar 2020)
- The Implicit Regularization of Stochastic Gradient Flow for Least Squares. Alnur Ali, Yan Sun, Robert Tibshirani (17 Mar 2020)
- Dynamic transformation of prior knowledge into Bayesian models for data streams. Tran Xuan Bach, N. Anh, Ngo Van Linh, Khoat Than (13 Mar 2020)
- Truncated Inference for Latent Variable Optimization Problems: Application to Robust Estimation and Learning. Christopher Zach, Huu Le (12 Mar 2020)
- Machine Learning on Volatile Instances. Xiaoxi Zhang, Jianyu Wang, Gauri Joshi, Carlee Joe-Wong (12 Mar 2020)
- On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings. Mahmoud Assran, Michael G. Rabbat (27 Feb 2020)
- Disentangling Adaptive Gradient Methods from Learning Rates. Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang (26 Feb 2020)
- PrIU: A Provenance-Based Approach for Incrementally Updating Regression Models. Yinjun Wu, V. Tannen, S. Davidson (26 Feb 2020)
- LASG: Lazily Aggregated Stochastic Gradients for Communication-Efficient Distributed Learning. Tianyi Chen, Yuejiao Sun, W. Yin (26 Feb 2020) [FedML]
- Device Heterogeneity in Federated Learning: A Superquantile Approach. Yassine Laguel, Krishna Pillutla, J. Malick, Zaïd Harchaoui (25 Feb 2020) [FedML]
- Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs. Lei Huang, Jie Qin, Li Liu, Fan Zhu, Ling Shao (25 Feb 2020) [AI4CE]
- Can speed up the convergence rate of stochastic gradient methods to $\mathcal{O}(1/k^2)$ by a gradient averaging strategy? Xin Xu, Xiaopeng Luo (25 Feb 2020)
- Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent. Bao Wang, T. Nguyen, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher (24 Feb 2020) [ODL]