Optimization Methods for Large-Scale Machine Learning

15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 866 papers shown
Rethinking LLM Training through Information Geometry and Quantum Metrics
Riccardo Di Sipio
14
0
0
18 Jun 2025
ImprovDML: Improved Trade-off in Private Byzantine-Resilient Distributed Machine Learning
Bing Liu
Chengcheng Zhao
L. Chai
Peng Cheng
Yaonan Wang
FedML
22
0
0
18 Jun 2025
Adjusted Shuffling SARAH: Advancing Complexity Analysis via Dynamic Gradient Weighting
Duc Toan Nguyen
Trang H. Tran
Lam M. Nguyen
22
0
0
14 Jun 2025
Convergence of Momentum-Based Optimization Algorithms with Time-Varying Parameters
Mathukumalli Vidyasagar
73
0
0
13 Jun 2025
Complexity of normalized stochastic first-order methods with momentum under heavy-tailed noise
Chuan He
Zhaosong Lu
Defeng Sun
Zhanwang Deng
30
0
0
12 Jun 2025
NDCG-Consistent Softmax Approximation with Accelerated Convergence
Yuanhao Pu
Defu Lian
Xiaolong Chen
Xu Huang
Jin Chen
Enhong Chen
53
0
0
11 Jun 2025
Neural Tangent Kernel Analysis to Probe Convergence in Physics-informed Neural Solvers: PIKANs vs. PINNs
Salah A Faroughi
Farinaz Mostajeran
15
0
0
09 Jun 2025
A Stable Whitening Optimizer for Efficient Neural Network Training
Kevin Frans
Sergey Levine
Pieter Abbeel
35
0
0
08 Jun 2025
Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
Runa Eschenhagen
Aaron Defazio
Tsung-Hsien Lee
Richard Turner
Hao-Jun Michael Shi
93
0
0
04 Jun 2025
Deformable registration and generative modelling of aortic anatomies by auto-decoders and neural ODEs
Riccardo Tenderini
Luca Pegolotti
Fanwei Kong
S. Pagani
Francesco Regazzoni
Alison L. Marsden
S. Deparis
MedIm AI4CE
49
0
0
01 Jun 2025
Stationary MMD Points for Cubature
Zonghao Chen
Toni Karvonen
Heishiro Kanagawa
F. Briol
Chris J. Oates
82
0
0
27 May 2025
Moment Expansions of the Energy Distance
Ian Langmore
17
0
0
27 May 2025
Dynamic Manifold Evolution Theory: Modeling and Stability Analysis of Latent Representations in Large Language Models
Yukun Zhang
Qi Dong
AI4CE
26
0
0
24 May 2025
Implicit Neural Shape Optimization for 3D High-Contrast Electrical Impedance Tomography
Junqing Chen
Haibo Liu
258
0
0
22 May 2025
Never Skip a Batch: Continuous Training of Temporal GNNs via Adaptive Pseudo-Supervision
Alexander Panyshev
Dmitry Vinichenko
Oleg Travkin
Roman Alferov
Alexey Zaytsev
69
0
0
18 May 2025
Dynamic Perturbed Adaptive Method for Infinite Task-Conflicting Time Series
Jiang You
Xiaozhen Wang
Arben Cela
AI4TS
82
0
0
17 May 2025
A stochastic gradient method for trilevel optimization
Tommaso Giovannelli
G. Kent
Luis Nunes Vicente
74
0
0
11 May 2025
Entropy-Guided Sampling of Flat Modes in Discrete Spaces
Pinaki Mohanty
Riddhiman Bhattacharya
Ruqi Zhang
438
0
0
05 May 2025
Online Functional Principal Component Analysis on a Multidimensional Domain
Muye Nanshan
Nan Zhang
Jiguo Cao
43
0
0
04 May 2025
Accelerating Deep Neural Network Training via Distributed Hybrid Order Optimization
Shunxian Gu
Chaoqun You
Bangbang Ren
Lailong Luo
Junxu Xia
Deke Guo
72
0
0
02 May 2025
OptimAI: Optimization from Natural Language Using LLM-Powered AI Agents
Raghav Thind
Youran Sun
Ling Liang
Haizhao Yang
LLMAG
215
0
0
23 Apr 2025
AlphaGrad: Non-Linear Gradient Normalization Optimizer
Soham Sane
ODL
147
0
0
22 Apr 2025
MetaMolGen: A Neural Graph Motif Generation Model for De Novo Molecular Design
Zimo Yan
Jie Zhang
Zheng Xie
Chang-rui Liu
Yang Liu
Yiping Song
115
0
0
22 Apr 2025
Mixed-Precision Conjugate Gradient Solvers with RL-Driven Precision Tuning
Xinye Chen
96
0
0
19 Apr 2025
A Piecewise Lyapunov Analysis of Sub-quadratic SGD: Applications to Robust and Quantile Regression
Yixuan Zhang
Dongyan
Yudong Chen
Qiaomin Xie
60
0
0
11 Apr 2025
Decentralized Federated Domain Generalization with Style Sharing: A Formal Modeling and Convergence Analysis
Shahryar Zehtabi
Dong-Jun Han
Seyyedali Hosseinalipour
Christopher G. Brinton
FedML AI4CE
136
0
0
08 Apr 2025
Universal Collection of Euclidean Invariants between Pairs of Position-Orientations
Gijs Bellaard
B. Smets
R. Duits
131
0
0
04 Apr 2025
Approximate Agreement Algorithms for Byzantine Collaborative Learning
Tijana Milentijević
Mélanie Cambus
Darya Melnyk
Stefan Schmid
FedML
148
1
0
02 Apr 2025
SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation
Dahun Shin
Dongyeop Lee
Jinseok Chung
Namhoon Lee
ODL AAML
517
0
0
25 Feb 2025
Convergence of Shallow ReLU Networks on Weakly Interacting Data
Léo Dana
Francis R. Bach
Loucas Pillaud-Vivien
MLT
95
2
0
24 Feb 2025
Verification and Validation for Trustworthy Scientific Machine Learning
John D. Jakeman
Lorena A. Barba
J. Martins
Thomas O'Leary-Roseberry
AI4CE
138
1
0
21 Feb 2025
Analog In-memory Training on General Non-ideal Resistive Elements: The Impact of Response Functions
Zhaoxian Wu
Quan Xian
Tayfun Gokmen
Omobayode Fagbohungbe
Tianyi Chen
172
0
0
17 Feb 2025
Preconditioned Inexact Stochastic ADMM for Deep Model
Shenglong Zhou
Ouya Wang
Ziyan Luo
Yongxu Zhu
Geoffrey Ye Li
88
0
0
15 Feb 2025
PBM-VFL: Vertical Federated Learning with Feature and Sample Privacy
Linh Tran
Timothy Castiglia
Stacy Patterson
Ana Milanova
FedML
109
0
0
23 Jan 2025
Celo: Training Versatile Learned Optimizers on a Compute Diet
A. Moudgil
Boris Knyazev
Guillaume Lajoie
Eugene Belilovsky
446
0
0
22 Jan 2025
Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis
Ruichen Luo
Sebastian U Stich
Samuel Horváth
Martin Takáč
142
0
0
08 Jan 2025
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau
Weijian Li
Chenwei Xu
Han Liu
Mladen Kolar
466
0
0
30 Dec 2024
Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization
Corrado Coppola
Lorenzo Papa
Irene Amerini
L. Palagi
ODL
127
0
0
24 Nov 2024
Hierarchical mixtures of Unigram models for short text clustering: The role of Beta-Liouville priors
Massimo Bilancia
Samuele Magro
101
0
0
29 Oct 2024
From Gradient Clipping to Normalization for Heavy Tailed SGD
Florian Hübler
Ilyas Fatkhullin
Niao He
113
10
0
17 Oct 2024
Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees
Aleksandar Armacki
Shuhua Yu
Pranay Sharma
Gauri Joshi
Dragana Bajović
D. Jakovetić
S. Kar
117
2
0
17 Oct 2024
Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering
Joris Postmus
Steven Abreu
LLMSV
370
3
0
09 Oct 2024
Extended convexity and smoothness and their applications in deep learning
Binchuan Qi
Wei Gong
Li Li
117
0
0
08 Oct 2024
An Attention-Based Algorithm for Gravity Adaptation Zone Calibration
Chen Yu
59
0
0
06 Oct 2024
Asymmetry of the Relative Entropy in the Regularization of Empirical Risk Minimization
Francisco Daunas
I. Esnaola
S. Perlaza
H. Vincent Poor
106
3
0
02 Oct 2024
SetPINNs: Set-based Physics-informed Neural Networks
Mayank Nagda
Phil Ostheimer
Thomas Specht
Frank Rhein
Fabian Jirasek
Stephan Mandt
Marius Kloft
Sophie Fellenz
PINN 3DPC
201
1
0
30 Sep 2024
Robust Clustering on High-Dimensional Data with Stochastic Quantization
Anton Kozyriev
Vladimir Norkin
MQ
83
3
0
03 Sep 2024
Hierarchical Learning and Computing over Space-Ground Integrated Networks
Jingyang Zhu
Yuanming Shi
Yong Zhou
Chunxiao Jiang
Linling Kuang
97
2
0
26 Aug 2024
Predicting path-dependent processes by deep learning
Xudong Zheng
Yuecai Han
42
0
0
19 Aug 2024
Online-Score-Aided Federated Learning: Taming the Resource Constraints in Wireless Networks
Md Ferdous Pervej
Minseok Choi
A. Molisch
99
1
0
12 Aug 2024