Optimization Methods for Large-Scale Machine Learning

15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,406 papers shown
DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series
Zahra Zamanzadeh Darban
Yiyuan Yang
Geoffrey I. Webb
Charu C. Aggarwal
Qingsong Wen
Shirui Pan
Mahsa Salehi
60
0
0
17 Apr 2024
I/O in Machine Learning Applications on HPC Systems: A 360-degree Survey
Noah Lewis
J. L. Bez
Suren Byna
57
0
0
16 Apr 2024
Minimizing Chebyshev Prototype Risk Magically Mitigates the Perils of Overfitting
Nathaniel R. Dean
Dilip Sarkar
AAML
28
0
0
10 Apr 2024
Unifying Low Dimensional Observations in Deep Learning Through the Deep Linear Unconstrained Feature Model
Connall Garrod
Jonathan P. Keating
41
8
0
09 Apr 2024
Stochastic Online Optimization for Cyber-Physical and Robotic Systems
Hao Ma
M. Zeilinger
Michael Muehlebach
50
0
0
08 Apr 2024
A Structure-Guided Gauss-Newton Method for Shallow ReLU Neural Network
Zhiqiang Cai
Tong Ding
Min Liu
Xinyu Liu
Jianlin Xia
173
2
0
07 Apr 2024
Optimal Batch Allocation for Wireless Federated Learning
Jaeyoung Song
Sang-Woon Jeon
31
0
0
03 Apr 2024
Satellite Federated Edge Learning: Architecture Design and Convergence Analysis
Yuanming Shi
Li Zeng
Jingyang Zhu
Yong Zhou
Chunxiao Jiang
Khaled B. Letaief
33
12
0
02 Apr 2024
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
Xingwu Chen
Difan Zou
ViT
26
12
0
02 Apr 2024
DRIVE: Dual Gradient-Based Rapid Iterative Pruning
Dhananjay Saikumar
Blesson Varghese
30
0
0
01 Apr 2024
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang
Yi Zhou
Shaofeng Zou
42
4
0
01 Apr 2024
Communication Efficient Distributed Training with Distributed Lion
Bo Liu
Lemeng Wu
Lizhang Chen
Kaizhao Liang
Jiaxu Zhu
Chen Liang
Raghuraman Krishnamoorthi
Qiang Liu
32
6
0
30 Mar 2024
HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks
Yongyi Yang
Jiaming Yang
Wei Hu
Michal Dereziński
48
0
0
26 Mar 2024
AI and Memory Wall
A. Gholami
Z. Yao
Sehoon Kim
Coleman Hooper
Michael W. Mahoney
Kurt Keutzer
27
143
0
21 Mar 2024
PETScML: Second-order solvers for training regression problems in Scientific Machine Learning
Stefano Zampini
Umberto Zerbinati
George Turkiyyah
David E. Keyes
43
4
0
18 Mar 2024
Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates
Riccardo Grazzi
Massimiliano Pontil
Saverio Salzo
49
1
0
18 Mar 2024
A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques
Xuetong Li
Yuan Gao
Hong Chang
Danyang Huang
Yingying Ma
...
Ke Xu
Jing Zhou
Xuening Zhu
Yingqiu Zhu
Hansheng Wang
44
7
0
17 Mar 2024
Streamlining in the Riemannian Realm: Efficient Riemannian Optimization with Loopless Variance Reduction
Yury Demidovich
Grigory Malinovsky
Peter Richtárik
56
2
0
11 Mar 2024
Shuffling Momentum Gradient Algorithm for Convex Optimization
Trang H. Tran
Quoc Tran-Dinh
Lam M. Nguyen
23
1
0
05 Mar 2024
SOFIM: Stochastic Optimization Using Regularized Fisher Information Matrix
Mrinmay Sen
A. K. Qin
Gayathri C
Raghu Kishore N
Yen-Wei Chen
Balasubramanian Raman
39
1
0
05 Mar 2024
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms
Toki Tahmid Inan
Mingrui Liu
Amarda Shehu
32
0
0
01 Mar 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
22
3
0
28 Feb 2024
Gradient-based Discrete Sampling with Automatic Cyclical Scheduling
Patrick Pynadath
Riddhiman Bhattacharya
Arun Hariharan
Ruqi Zhang
33
3
0
27 Feb 2024
Efficient Backpropagation with Variance-Controlled Adaptive Sampling
Ziteng Wang
Jianfei Chen
Jun Zhu
BDL
40
2
0
27 Feb 2024
On the connection between Noise-Contrastive Estimation and Contrastive Divergence
Amanda Olmin
Jakob Lindqvist
Lennart Svensson
Fredrik Lindsten
35
0
0
26 Feb 2024
NeuroFlux: Memory-Efficient CNN Training Using Adaptive Local Learning
Dhananjay Saikumar
Blesson Varghese
24
1
0
21 Feb 2024
Revisiting Convergence of AdaGrad with Relaxed Assumptions
Yusu Hong
Junhong Lin
28
12
0
21 Feb 2024
SGD with Clipping is Secretly Estimating the Median Gradient
Fabian Schaipp
Guillaume Garrigos
Umut Simsekli
Robert M. Gower
19
0
0
20 Feb 2024
Byzantine-Robust Federated Learning: Impact of Client Subsampling and Local Updates
Youssef Allouah
Sadegh Farhadkhani
R. Guerraoui
Nirupam Gupta
Rafael Pinot
Geovani Rizk
S. Voitovych
FedML
30
4
0
20 Feb 2024
OptEx: Expediting First-Order Optimization with Approximately Parallelized Iterations
Yao Shu
Jiongfeng Fang
Y. He
Fei Richard Yu
27
0
0
18 Feb 2024
AdAdaGrad: Adaptive Batch Size Schemes for Adaptive Gradient Methods
Tim Tsz-Kit Lau
Han Liu
Mladen Kolar
ODL
32
6
0
17 Feb 2024
An Accelerated Distributed Stochastic Gradient Method with Momentum
Kun-Yen Huang
Shi Pu
Angelia Nedić
35
8
0
15 Feb 2024
Switch EMA: A Free Lunch for Better Flatness and Sharpness
Siyuan Li
Zicheng Liu
Juanxi Tian
Ge Wang
Zedong Wang
...
Cheng Tan
Tao Lin
Yang Liu
Baigui Sun
Stan Z. Li
30
6
0
14 Feb 2024
Corridor Geometry in Gradient-Based Optimization
Benoit Dherin
M. Rosca
34
0
0
13 Feb 2024
Preconditioners for the Stochastic Training of Implicit Neural Representations
Shin-Fang Chng
Hemanth Saratchandran
Simon Lucey
26
0
0
13 Feb 2024
Tuning-Free Stochastic Optimization
Ahmed Khaled
Chi Jin
32
7
0
12 Feb 2024
Accelerating Distributed Deep Learning using Lossless Homomorphic Compression
Haoyu Li
Yuchen Xu
Jiayi Chen
Rohit Dwivedula
Wenfei Wu
Keqiang He
Aditya Akella
Daehyeok Kim
FedML
AI4CE
23
4
0
12 Feb 2024
Scalable Kernel Logistic Regression with Nyström Approximation: Theoretical Analysis and Application to Discrete Choice Modelling
José Ángel Martín-Baos
Ricardo García-Ródenas
Luis Rodriguez-Benitez
Michel Bierlaire
23
1
0
09 Feb 2024
Feed-Forward Neural Networks as a Mixed-Integer Program
Navid Aftabi
Nima Moradi
Fatemeh Mahroo
24
2
0
09 Feb 2024
An Inexact Halpern Iteration with Application to Distributionally Robust Optimization
Ling Liang
Kim-Chuan Toh
Jia Jie Zhu
32
4
0
08 Feb 2024
On the Convergence of Zeroth-Order Federated Tuning for Large Language Models
Zhenqing Ling
Daoyuan Chen
Liuyi Yao
Yaliang Li
Ying Shen
FedML
47
12
0
08 Feb 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong
Junhong Lin
46
11
0
06 Feb 2024
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Wu Lin
Felix Dangel
Runa Eschenhagen
Juhan Bae
Richard Turner
Alireza Makhzani
ODL
54
12
0
05 Feb 2024
Ginger: An Efficient Curvature Approximation with Linear Complexity for General Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
ODL
38
1
0
05 Feb 2024
Frugal Actor-Critic: Sample Efficient Off-Policy Deep Reinforcement Learning Using Unique Experiences
Nikhil Kumar Singh
Indranil Saha
OffRL
14
0
0
05 Feb 2024
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
Sobihan Surendran
Antoine Godichon-Baggioni
Adeline Fermanian
Sylvain Le Corff
45
1
0
05 Feb 2024
Incremental Quasi-Newton Methods with Faster Superlinear Convergence Rates
Zhuanghua Liu
Luo Luo
K. H. Low
16
2
0
04 Feb 2024
Emergence of heavy tails in homogenized stochastic gradient descent
Zhe Jiao
Martin Keller-Ressel
16
1
0
02 Feb 2024
Improved Quantization Strategies for Managing Heavy-tailed Gradients in Distributed Learning
Guangfeng Yan
Tan Li
Yuanzhang Xiao
Hanxu Hou
Linqi Song
MQ
29
0
0
02 Feb 2024
Truncated Non-Uniform Quantization for Distributed SGD
Guangfeng Yan
Tan Li
Yuanzhang Xiao
Congduan Li
Linqi Song
MQ
12
0
0
02 Feb 2024