Optimization Methods for Large-Scale Machine Learning

15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
arXiv:1606.04838

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 866 papers shown
LiD-FL: Towards List-Decodable Federated Learning
Hong Liu
Liren Shan
Han Bao
Ronghui You
Yuhao Yi
Jiancheng Lv
FedML
185
0
0
09 Aug 2024
Convergence Conditions for Stochastic Line Search Based Optimization of Over-parametrized Models
Matteo Lapucci
Davide Pucci
96
1
0
06 Aug 2024
Differentially Private Block-wise Gradient Shuffle for Deep Learning
Zilong Zhang
FedML
108
0
0
31 Jul 2024
Adaptive Mix for Semi-Supervised Medical Image Segmentation
Zhiqiang Shen
Peng Cao
Junming Su
Jinzhu Yang
Osmar R. Zaiane
154
0
0
31 Jul 2024
Many Perception Tasks are Highly Redundant Functions of their Input Data
Rahul Ramesh
Anthony Bisulco
Ronald W. DiTullio
Linran Wei
Vijay Balasubramanian
Kostas Daniilidis
Pratik Chaudhari
122
2
0
18 Jul 2024
An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes
Antonio Orvieto
Lin Xiao
84
4
0
05 Jul 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
162
0
0
11 Jun 2024
Provable Complexity Improvement of AdaGrad over SGD: Upper and Lower Bounds in Stochastic Non-Convex Optimization
Devyani Maladkar
Ruichen Jiang
Aryan Mokhtari
104
6
0
07 Jun 2024
Demystifying SGD with Doubly Stochastic Gradients
Kyurae Kim
Joohwan Ko
Yian Ma
Jacob R. Gardner
147
2
0
03 Jun 2024
Symmetries in Overparametrized Neural Networks: A Mean-Field View
Javier Maass
Joaquin Fontbona
MLT
FedML
186
2
0
30 May 2024
A Pontryagin Perspective on Reinforcement Learning
Onno Eberhard
Claire Vernade
Michael Muehlebach
122
3
0
28 May 2024
Kronecker-Factored Approximate Curvature for Physics-Informed Neural Networks
Felix Dangel
Johannes Müller
Marius Zeinhofer
ODL
107
11
0
24 May 2024
Almost sure convergence rates of stochastic gradient methods under gradient domination
Simon Weissmann
Sara Klein
Waïss Azizian
Leif Döring
93
3
0
22 May 2024
Reinforcement learning
Florentin Wörgötter
97
2,526
0
16 May 2024
A Full Adagrad algorithm with O(Nd) operations
Antoine Godichon-Baggioni
Wei Lu
Bruno Portier
ODL
136
0
0
03 May 2024
Second-order Information Promotes Mini-Batch Robustness in Variance-Reduced Gradients
Sachin Garg
A. Berahas
Michal Dereziński
92
1
0
23 Apr 2024
Rate Analysis of Coupled Distributed Stochastic Approximation for Misspecified Optimization
Yaqun Yang
Jinlong Lei
70
0
0
21 Apr 2024
DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series
Zahra Zamanzadeh Darban
Yiyuan Yang
Geoffrey I. Webb
Charu C. Aggarwal
Qingsong Wen
Xiaojun Jia
Mahsa Salehi
127
1
0
17 Apr 2024
I/O in Machine Learning Applications on HPC Systems: A 360-degree Survey
Noah Lewis
J. L. Bez
Suren Byna
113
0
0
16 Apr 2024
Stochastic Online Optimization for Cyber-Physical and Robotic Systems
Hao Ma
Melanie Zeilinger
Michael Muehlebach
92
1
0
08 Apr 2024
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang
Yi Zhou
Shaofeng Zou
140
7
0
01 Apr 2024
HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks
Yongyi Yang
Jiaming Yang
Wei Hu
Michal Dereziński
90
0
0
26 Mar 2024
AI and Memory Wall
A. Gholami
Z. Yao
Sehoon Kim
Coleman Hooper
Michael W. Mahoney
Kurt Keutzer
84
161
0
21 Mar 2024
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms
Toki Tahmid Inan
Mingrui Liu
Amarda Shehu
63
0
0
01 Mar 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
84
3
0
28 Feb 2024
OptEx: Expediting First-Order Optimization with Approximately Parallelized Iterations
Yao Shu
Jiongfeng Fang
Y. He
Fei Richard Yu
66
0
0
18 Feb 2024
Switch EMA: A Free Lunch for Better Flatness and Sharpness
Siyuan Li
Zicheng Liu
Juanxi Tian
Ge Wang
Zedong Wang
...
Cheng Tan
Tao Lin
Yang Liu
Baigui Sun
Stan Z. Li
66
6
0
14 Feb 2024
Scalable Kernel Logistic Regression with Nyström Approximation: Theoretical Analysis and Application to Discrete Choice Modelling
José Ángel Martín-Baos
Ricardo García-Ródenas
Luis Rodriguez-Benitez
Michel Bierlaire
50
1
0
09 Feb 2024
An Inexact Halpern Iteration with Application to Distributionally Robust Optimization
Ling Liang
Zusen Xu
Kim-Chuan Toh
Jia Jie Zhu
152
4
0
08 Feb 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong
Junhong Lin
129
13
0
06 Feb 2024
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
Sobihan Surendran
Antoine Godichon-Baggioni
Adeline Fermanian
Sylvain Le Corff
109
2
0
05 Feb 2024
Emergence of heavy tails in homogenized stochastic gradient descent
Zhe Jiao
Martin Keller-Ressel
61
1
0
02 Feb 2024
Towards Quantum-Safe Federated Learning via Homomorphic Encryption: Learning with Gradients
Guangfeng Yan
Shanxiang Lyu
Hanxu Hou
Zhiyong Zheng
Linqi Song
FedML
20
1
0
02 Feb 2024
Leveraging Nested MLMC for Sequential Neural Posterior Estimation with Intractable Likelihoods
Xiliang Yang
Yifei Xiong
Zhijian He
84
0
0
30 Jan 2024
Convergence Rates for Stochastic Approximation: Biased Noise with Unbounded Variance, and Applications
Rajeeva Laxman Karandikar
M. Vidyasagar
66
10
0
05 Dec 2023
On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations
Antoine Godichon-Baggioni
Nicklas Werge
ODL
118
4
0
29 Nov 2023
SensLI: Sensitivity-Based Layer Insertion for Neural Networks
Evelyn Herberg
Roland A. Herzog
Frederik Köhne
Leonie Kreis
Anton Schiela
50
0
0
27 Nov 2023
Transformer-based Named Entity Recognition in Construction Supply Chain Risk Management in Australia
Milad Baghalzadeh Shishehgarkhaneh
R. Moehler
Yihai Fang
Amer A. Hijazi
Hamed Aboutorab
100
10
0
23 Nov 2023
High Probability Guarantees for Random Reshuffling
Hengxu Yu
Xiao Li
91
2
0
20 Nov 2023
Information-Theoretic Trust Regions for Stochastic Gradient-Based Optimization
Philipp Dahlinger
P. Becker
Maximilian Hüttenrauch
Gerhard Neumann
46
0
0
31 Oct 2023
Optimization of utility-based shortfall risk: A non-asymptotic viewpoint
Sumedh Gupte
Prashanth L. A.
Sanjay P. Bhat
54
1
0
28 Oct 2023
Performative Prediction: Past and Future
Moritz Hardt
Celestine Mendler-Dünner
159
26
0
25 Oct 2023
Graph Neural Networks and Applied Linear Algebra
Nicholas S. Moore
Eric C. Cyr
Peter Ohm
C. Siefert
R. Tuminaro
75
4
0
21 Oct 2023
DYNAMITE: Dynamic Interplay of Mini-Batch Size and Aggregation Frequency for Federated Learning with Static and Streaming Dataset
Weijie Liu
Xiaoxi Zhang
Jingpu Duan
Carlee Joe-Wong
Zhi Zhou
Xu Chen
80
9
0
20 Oct 2023
High Throughput Training of Deep Surrogates from Large Ensemble Runs
Lucas Meyer
M. Schouler
R. Caulk
Alejandro Ribés
Bruno Raffin
AI4CE
57
6
0
28 Sep 2023
A Novel Gradient Methodology with Economical Objective Function Evaluations for Data Science Applications
Christian Varner
Vivak Patel
71
2
0
19 Sep 2023
Derivation of Coordinate Descent Algorithms from Optimal Control Theory
I. Michael Ross
27
1
0
07 Sep 2023
We Don't Need No Adam, All We Need Is EVE: On The Variance of Dual Learning Rate And Beyond
A. Khadangi
ODL
109
0
0
21 Aug 2023
Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent
Xiaoge Deng
Li Shen
Shengwei Li
Tao Sun
Dongsheng Li
Dacheng Tao
85
3
0
18 Aug 2023
Quantile Optimization via Multiple Timescale Local Search for Black-box Functions
Jiaqiao Hu
Meichen Song
Michael Fu
28
6
0
15 Aug 2023