Optimization Methods for Large-Scale Machine Learning

15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 866 papers shown

Rethinking LLM Training through Information Geometry and Quantum Metrics Riccardo Di Sipio 14 0 0 18 Jun 2025
ImprovDML: Improved Trade-off in Private Byzantine-Resilient Distributed Machine Learning Bing Liu Chengcheng Zhao L. Chai Peng Cheng Yaonan Wang FedML 22 0 0 18 Jun 2025
Adjusted Shuffling SARAH: Advancing Complexity Analysis via Dynamic Gradient Weighting Duc Toan Nguyen Trang H. Tran Lam M. Nguyen 22 0 0 14 Jun 2025
Convergence of Momentum-Based Optimization Algorithms with Time-Varying Parameters Mathukumalli Vidyasagar 73 0 0 13 Jun 2025
Complexity of normalized stochastic first-order methods with momentum under heavy-tailed noise Chuan He Zhaosong Lu Defeng Sun Zhanwang Deng 30 0 0 12 Jun 2025
NDCG-Consistent Softmax Approximation with Accelerated Convergence Yuanhao Pu Defu Lian Xiaolong Chen Xu Huang Jin Chen Enhong Chen 53 0 0 11 Jun 2025
Neural Tangent Kernel Analysis to Probe Convergence in Physics-informed Neural Solvers: PIKANs vs. PINNs Salah A Faroughi Farinaz Mostajeran 15 0 0 09 Jun 2025
A Stable Whitening Optimizer for Efficient Neural Network Training Kevin Frans Sergey Levine Pieter Abbeel 35 0 0 08 Jun 2025
Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner Runa Eschenhagen Aaron Defazio Tsung-Hsien Lee Richard Turner Hao-Jun Michael Shi 93 0 0 04 Jun 2025
Deformable registration and generative modelling of aortic anatomies by auto-decoders and neural ODEs Riccardo Tenderini Luca Pegolotti Fanwei Kong S. Pagani Francesco Regazzoni Alison L. Marsden S. Deparis MedIm AI4CE 49 0 0 01 Jun 2025
Stationary MMD Points for Cubature Zonghao Chen Toni Karvonen Heishiro Kanagawa F. Briol Chris J. Oates 82 0 0 27 May 2025
Moment Expansions of the Energy Distance Ian Langmore 17 0 0 27 May 2025
Dynamic Manifold Evolution Theory: Modeling and Stability Analysis of Latent Representations in Large Language Models Yukun Zhang Qi Dong AI4CE 28 0 0 24 May 2025
Implicit Neural Shape Optimization for 3D High-Contrast Electrical Impedance Tomography Junqing Chen Haibo Liu 258 0 0 22 May 2025
Never Skip a Batch: Continuous Training of Temporal GNNs via Adaptive Pseudo-Supervision Alexander Panyshev Dmitry Vinichenko Oleg Travkin Roman Alferov Alexey Zaytsev 69 0 0 18 May 2025
Dynamic Perturbed Adaptive Method for Infinite Task-Conflicting Time Series Jiang You Xiaozhen Wang Arben Cela AI4TS 82 0 0 17 May 2025
A stochastic gradient method for trilevel optimization Tommaso Giovannelli G. Kent Luis Nunes Vicente 74 0 0 11 May 2025
Entropy-Guided Sampling of Flat Modes in Discrete Spaces Pinaki Mohanty Riddhiman Bhattacharya Ruqi Zhang 440 0 0 05 May 2025
Online Functional Principal Component Analysis on a Multidimensional Domain Muye Nanshan Nan Zhang Jiguo Cao 43 0 0 04 May 2025
Accelerating Deep Neural Network Training via Distributed Hybrid Order Optimization Shunxian Gu Chaoqun You Bangbang Ren Lailong Luo Junxu Xia Deke Guo 72 0 0 02 May 2025
OptimAI: Optimization from Natural Language Using LLM-Powered AI Agents Raghav Thind Youran Sun Ling Liang Haizhao Yang LLMAG 215 0 0 23 Apr 2025
AlphaGrad: Non-Linear Gradient Normalization Optimizer Soham Sane ODL 147 0 0 22 Apr 2025
MetaMolGen: A Neural Graph Motif Generation Model for De Novo Molecular Design Zimo Yan Jie Zhang Zheng Xie Chang-rui Liu Yang Liu Yiping Song 115 0 0 22 Apr 2025
Mixed-Precision Conjugate Gradient Solvers with RL-Driven Precision Tuning Xinye Chen 96 0 0 19 Apr 2025
A Piecewise Lyapunov Analysis of Sub-quadratic SGD: Applications to Robust and Quantile Regression Yixuan Zhang Dongyan Yudong Chen Qiaomin Xie 60 0 0 11 Apr 2025
Decentralized Federated Domain Generalization with Style Sharing: A Formal Modeling and Convergence Analysis Shahryar Zehtabi Dong-Jun Han Seyyedali Hosseinalipour Christopher G. Brinton FedML AI4CE 136 0 0 08 Apr 2025
Universal Collection of Euclidean Invariants between Pairs of Position-Orientations Gijs Bellaard B. Smets R. Duits 131 0 0 04 Apr 2025
Approximate Agreement Algorithms for Byzantine Collaborative Learning Tijana Milentijević Mélanie Cambus Darya Melnyk Stefan Schmid FedML 148 1 0 02 Apr 2025
SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation Dahun Shin Dongyeop Lee Jinseok Chung Namhoon Lee ODL AAML 517 0 0 25 Feb 2025
Convergence of Shallow ReLU Networks on Weakly Interacting Data Léo Dana Francis R. Bach Loucas Pillaud-Vivien MLT 95 2 0 24 Feb 2025
Verification and Validation for Trustworthy Scientific Machine Learning John D. Jakeman Lorena A. Barba J. Martins Thomas O'Leary-Roseberry AI4CE 138 1 0 21 Feb 2025
Analog In-memory Training on General Non-ideal Resistive Elements: The Impact of Response Functions Zhaoxian Wu Quan Xian Tayfun Gokmen Omobayode Fagbohungbe Tianyi Chen 172 0 0 17 Feb 2025
Preconditioned Inexact Stochastic ADMM for Deep Model Shenglong Zhou Ouya Wang Ziyan Luo Yongxu Zhu Geoffrey Ye Li 88 0 0 15 Feb 2025
PBM-VFL: Vertical Federated Learning with Feature and Sample Privacy Linh Tran Timothy Castiglia Stacy Patterson Ana Milanova FedML 109 0 0 23 Jan 2025
Celo: Training Versatile Learned Optimizers on a Compute Diet A. Moudgil Boris Knyazev Guillaume Lajoie Eugene Belilovsky 446 0 0 22 Jan 2025
Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis Ruichen Luo Sebastian U Stich Samuel Horváth Martin Takáč 142 0 0 08 Jan 2025
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism Tim Tsz-Kit Lau Weijian Li Chenwei Xu Han Liu Mladen Kolar 466 0 0 30 Dec 2024
Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization Corrado Coppola Lorenzo Papa Irene Amerini L. Palagi ODL 127 0 0 24 Nov 2024
Hierarchical mixtures of Unigram models for short text clustering: The role of Beta-Liouville priors Massimo Bilancia Samuele Magro 101 0 0 29 Oct 2024
From Gradient Clipping to Normalization for Heavy Tailed SGD Florian Hübler Ilyas Fatkhullin Niao He 113 10 0 17 Oct 2024
Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees Aleksandar Armacki Shuhua Yu Pranay Sharma Gauri Joshi Dragana Bajović D. Jakovetić S. Kar 117 2 0 17 Oct 2024
Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering Joris Postmus Steven Abreu LLMSV 370 3 0 09 Oct 2024
Extended convexity and smoothness and their applications in deep learning Binchuan Qi Wei Gong Li Li 117 0 0 08 Oct 2024
An Attention-Based Algorithm for Gravity Adaptation Zone Calibration Chen Yu 59 0 0 06 Oct 2024
Asymmetry of the Relative Entropy in the Regularization of Empirical Risk Minimization Francisco Daunas I. Esnaola S. Perlaza H. Vincent Poor 106 3 0 02 Oct 2024
SetPINNs: Set-based Physics-informed Neural Networks Mayank Nagda Phil Ostheimer Thomas Specht Frank Rhein Fabian Jirasek Stephan Mandt Marius Kloft Sophie Fellenz PINN 3DPC 201 1 0 30 Sep 2024
Robust Clustering on High-Dimensional Data with Stochastic Quantization Anton Kozyriev Vladimir Norkin MQ 83 3 0 03 Sep 2024
Hierarchical Learning and Computing over Space-Ground Integrated Networks Jingyang Zhu Yuanming Shi Yong Zhou Chunxiao Jiang Linling Kuang 97 2 0 26 Aug 2024
Predicting path-dependent processes by deep learning Xudong Zheng Yuecai Han 42 0 0 19 Aug 2024
Online-Score-Aided Federated Learning: Taming the Resource Constraints in Wireless Networks Md Ferdous Pervej Minseok Choi A. Molisch 99 1 0 12 Aug 2024