
Optimization Methods for Large-Scale Machine Learning

15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 866 papers shown

Title
The Marginal Value of Momentum for Small Learning Rate SGD Runzhe Wang Sadhika Malladi Tianhao Wang Kaifeng Lyu Zhiyuan Li ODL 86 9 0 27 Jul 2023
Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case Meixuan He Yuqing Liang Jinlan Liu Dongpo Xu 85 9 0 20 Jul 2023
Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality Ziyang Wei Wanrong Zhu Wei Biao Wu 130 5 0 13 Jul 2023
Empirical Risk Minimization with Shuffled SGD: A Primal-Dual Perspective and Improved Bounds Xu Cai Cheuk Yin Lin Jelena Diakonikolas FedML 74 5 0 21 Jun 2023
Bootstrapped Representations in Reinforcement Learning Charline Le Lan Stephen Tu Mark Rowland Anna Harutyunyan Rishabh Agarwal Marc G. Bellemare Will Dabney OffRL OOD SSL 138 10 0 16 Jun 2023
Schema-learning and rebinding as mechanisms of in-context learning and emergence Siva K. Swaminathan Antoine Dedieu Rajkumar Vasudeva Raju Murray Shanahan Miguel Lazaro-Gredilla Dileep George 99 14 0 16 Jun 2023
Robustly Learning a Single Neuron via Sharpness Puqian Wang Nikos Zarifis Ilias Diakonikolas Jelena Diakonikolas 67 9 0 13 Jun 2023
Decentralized SGD and Average-direction SAM are Asymptotically Equivalent Tongtian Zhu Fengxiang He Kaixuan Chen Mingli Song Dacheng Tao 158 15 0 05 Jun 2023
Incentivizing Honesty among Competitors in Collaborative Learning and Optimization Florian E. Dorner Nikola Konstantinov Georgi Pashaliev Martin Vechev FedML 144 7 0 25 May 2023
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods Junchi Yang Xiang Li Ilyas Fatkhullin Niao He 92 17 0 21 May 2023
Online Learning Under A Separable Stochastic Approximation Framework Min Gan Xiang-Xiang Su Guang-yong Chen Jing Chen 66 0 0 12 May 2023
Over-the-Air Federated Averaging with Limited Power and Privacy Budgets Na Yan Kezhi Wang Cunhua Pan K. K. Chai Feng Shu Jiangzhou Wang FedML 55 2 0 05 May 2023
Multilevel Monte Carlo estimators for derivative-free optimization under uncertainty F. Menhorn Gianluca Geraci D. Seidl Youssef M. Marzouk M. Eldred H. Bungartz 50 1 0 04 May 2023
A Cluster-Based Opposition Differential Evolution Algorithm Boosted by a Local Search for ECG Signal Classification Mehran Pourvahab Seyed Jalaleddin Mousavirad Virginie Felizardo Pedro Gusmão Henriques Zacarias Hamzeh Mohammadigheymasi Nicholas D. Lane Seyed Nooreddin Jafari Nuno M. Garcia 56 3 0 04 May 2023
When Deep Learning Meets Polyhedral Theory: A Survey Joey Huchette Gonzalo Muñoz Thiago Serra Calvin Tsay AI4CE 160 37 0 29 Apr 2023
An Adaptive Policy to Employ Sharpness-Aware Minimization Weisen Jiang Hansi Yang Yu Zhang James T. Kwok AAML 130 34 0 28 Apr 2023
Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein Space Michael Diao Krishnakumar Balasubramanian Sinho Chewi Adil Salim BDL 68 29 0 10 Apr 2023
High-dimensional scaling limits and fluctuations of online least-squares SGD with smooth covariance Krishnakumar Balasubramanian Promit Ghosal Ye He 107 5 0 03 Apr 2023
FedAgg: Adaptive Federated Learning with Aggregated Gradients Wenhao Yuan Xuehe Wang FedML 145 1 0 28 Mar 2023
Forget-free Continual Learning with Soft-Winning SubNetworks Haeyong Kang Jaehong Yoon Sultan Rizky Hikmawan Madjid Sung Ju Hwang Chang D. Yoo CLL 105 4 0 27 Mar 2023
Interacting Particle Langevin Algorithm for Maximum Marginal Likelihood Estimation Ö. Deniz Akyildiz F. R. Crucinio Mark Girolami Tim Johnston Sotirios Sabanis 138 13 0 23 Mar 2023
On the Utility of Equal Batch Sizes for Inference in Stochastic Gradient Descent Rahul Singh A. Shukla Dootika Vats 56 0 0 14 Mar 2023
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond Jaeyoung Cha Jaewook Lee Chulhee Yun 87 24 0 13 Mar 2023
Multi-task neural networks by learned contextual inputs Anders T. Sandnes B. Grimstad O. Kolbjørnsen 51 2 0 01 Mar 2023
Maximum Likelihood With a Time Varying Parameter Alberto Lanconelli Christopher S. A. Lauria 52 4 0 28 Feb 2023
Statistical Inference with Stochastic Gradient Methods under $φ$-mixing Data Ruiqi Liu Xinyu Chen Zuofeng Shang FedML 84 6 0 24 Feb 2023
WW-FL: Secure and Private Large-Scale Federated Learning F. Marx T. Schneider Ajith Suresh Tobias Wehrle Christian Weinert Hossein Yalame FedML 66 2 0 20 Feb 2023
Statistical Inference for Linear Functionals of Online SGD in High-dimensional Linear Regression Bhavya Agrawalla Krishnakumar Balasubramanian Promit Ghosal 109 2 0 20 Feb 2023
On the convergence result of the gradient-push algorithm on directed graphs with constant stepsize Woocheol Choi Doheon Kim S. Yun 67 1 0 17 Feb 2023
Extragradient-Type Methods with $\mathcal{O}(1/k)$ Last-Iterate Convergence Rates for Co-Hypomonotone Inclusions Quoc Tran-Dinh 81 2 0 08 Feb 2023
Improving the Model Consistency of Decentralized Federated Learning Yi Shi Li Shen Kang Wei Yan Sun Bo Yuan Xueqian Wang Dacheng Tao FedML 114 52 0 08 Feb 2023
Target-based Surrogates for Stochastic Optimization J. Lavington Sharan Vaswani Reza Babanezhad Mark Schmidt Nicolas Le Roux 99 6 0 06 Feb 2023
A Survey on Efficient Training of Transformers Bohan Zhuang Jing Liu Zizheng Pan Haoyu He Yuetian Weng Chunhua Shen 130 49 0 02 Feb 2023
MLPGradientFlow: going with the flow of multilayer perceptrons (and finding minima fast and accurately) Johanni Brea Flavio Martinelli Berfin Simsek W. Gerstner 35 4 0 25 Jan 2023
A Stochastic Proximal Polyak Step Size Fabian Schaipp Robert Mansel Gower M. Ulbrich 64 12 0 12 Jan 2023
Federated Learning under Heterogeneous and Correlated Client Availability Angelo Rodio Francescomaria Faticanti Othmane Marfoq Giovanni Neglia Emilio Leonardi FedML 84 27 0 11 Jan 2023
Sharper Analysis for Minibatch Stochastic Proximal Point Methods: Stability, Smoothness, and Deviation Xiao-Tong Yuan P. Li 89 2 0 09 Jan 2023
Randomized Block-Coordinate Optimistic Gradient Algorithms for Root-Finding Problems Quoc Tran-Dinh Yang Luo 194 8 0 08 Jan 2023
Federated Learning for Data Streams Othmane Marfoq Giovanni Neglia Laetitia Kameni Richard Vidal FedML 82 12 0 04 Jan 2023
Variance Reduction for Score Functions Using Optimal Baselines Ronan L. Keane H. Gao 52 0 0 27 Dec 2022
Gradient Descent-Type Methods: Background and Simple Unified Convergence Analysis Quoc Tran-Dinh Marten van Dijk 58 0 0 19 Dec 2022
Convergence Analysis for Training Stochastic Neural Networks via Stochastic Gradient Descent Richard Archibald F. Bao Yanzhao Cao Hui‐Jie Sun 99 2 0 17 Dec 2022
Scheduling and Aggregation Design for Asynchronous Federated Learning over Wireless Networks Chung-Hsuan Hu Zheng Chen Erik G. Larsson 86 73 0 14 Dec 2022
Learning useful representations for shifting tasks and distributions Jianyu Zhang Léon Bottou OOD 76 14 0 14 Dec 2022
Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points Mayank Baranwal Param Budhraja V. Raj A. Hota 67 3 0 07 Dec 2022
Distributed Stochastic Gradient Descent with Cost-Sensitive and Strategic Agents Abdullah Basar Akbay C. Tepedelenlioğlu FedML 53 0 0 05 Dec 2022
Convergence of ease-controlled Random Reshuffling gradient Algorithms under Lipschitz smoothness R. Seccia Corrado Coppola G. Liuzzi L. Palagi 76 2 0 04 Dec 2022
Learning-Assisted Algorithm Unrolling for Online Optimization with Budget Constraints Jianyi Yang Shaolei Ren 85 2 0 03 Dec 2022
GlueFL: Reconciling Client Sampling and Model Masking for Bandwidth Efficient Federated Learning Shiqi He Qifan Yan Feijie Wu Lanjun Wang Mathias Lécuyer Ivan Beschastnikh FedML 82 8 0 03 Dec 2022
Impact of Redundancy on Resilience in Distributed Optimization and Learning Shuo Liu Nirupam Gupta Nitin H. Vaidya 107 2 0 16 Nov 2022