1606.04838
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing
"Optimization Methods for Large-Scale Machine Learning"
50 / 866 papers shown
Title
LiD-FL: Towards List-Decodable Federated Learning
Hong Liu
Liren Shan
Han Bao
Ronghui You
Yuhao Yi
Jiancheng Lv
FedML
185
0
0
09 Aug 2024
Convergence Conditions for Stochastic Line Search Based Optimization of Over-parametrized Models
Matteo Lapucci
Davide Pucci
96
1
0
06 Aug 2024
Differentially Private Block-wise Gradient Shuffle for Deep Learning
Zilong Zhang
FedML
108
0
0
31 Jul 2024
Adaptive Mix for Semi-Supervised Medical Image Segmentation
Zhiqiang Shen
Peng Cao
Junming Su
Jinzhu Yang
Osmar R. Zaiane
154
0
0
31 Jul 2024
Many Perception Tasks are Highly Redundant Functions of their Input Data
Rahul Ramesh
Anthony Bisulco
Ronald W. DiTullio
Linran Wei
Vijay Balasubramanian
Kostas Daniilidis
Pratik Chaudhari
122
2
0
18 Jul 2024
An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes
Antonio Orvieto
Lin Xiao
84
4
0
05 Jul 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
162
0
0
11 Jun 2024
Provable Complexity Improvement of AdaGrad over SGD: Upper and Lower Bounds in Stochastic Non-Convex Optimization
Devyani Maladkar
Ruichen Jiang
Aryan Mokhtari
104
6
0
07 Jun 2024
Demystifying SGD with Doubly Stochastic Gradients
Kyurae Kim
Joohwan Ko
Yian Ma
Jacob R. Gardner
147
2
0
03 Jun 2024
Symmetries in Overparametrized Neural Networks: A Mean-Field View
Javier Maass
Joaquin Fontbona
MLT
FedML
186
2
0
30 May 2024
A Pontryagin Perspective on Reinforcement Learning
Onno Eberhard
Claire Vernade
Michael Muehlebach
122
3
0
28 May 2024
Kronecker-Factored Approximate Curvature for Physics-Informed Neural Networks
Felix Dangel
Johannes Müller
Marius Zeinhofer
ODL
107
11
0
24 May 2024
Almost sure convergence rates of stochastic gradient methods under gradient domination
Simon Weissmann
Sara Klein
Waïss Azizian
Leif Döring
93
3
0
22 May 2024
Reinforcement learning
Florentin Wörgötter
94
2,526
0
16 May 2024
A Full Adagrad algorithm with O(Nd) operations
Antoine Godichon-Baggioni
Wei Lu
Bruno Portier
ODL
136
0
0
03 May 2024
Second-order Information Promotes Mini-Batch Robustness in Variance-Reduced Gradients
Sachin Garg
A. Berahas
Michał Dereziński
92
1
0
23 Apr 2024
Rate Analysis of Coupled Distributed Stochastic Approximation for Misspecified Optimization
Yaqun Yang
Jinlong Lei
70
0
0
21 Apr 2024
DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series
Zahra Zamanzadeh Darban
Yiyuan Yang
Geoffrey I. Webb
Charu C. Aggarwal
Qingsong Wen
Xiaojun Jia
Mahsa Salehi
127
1
0
17 Apr 2024
I/O in Machine Learning Applications on HPC Systems: A 360-degree Survey
Noah Lewis
J. L. Bez
Suren Byna
113
0
0
16 Apr 2024
Stochastic Online Optimization for Cyber-Physical and Robotic Systems
Hao Ma
Melanie Zeilinger
Michael Muehlebach
92
1
0
08 Apr 2024
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang
Yi Zhou
Shaofeng Zou
140
7
0
01 Apr 2024
HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks
Yongyi Yang
Jiaming Yang
Wei Hu
Michał Dereziński
90
0
0
26 Mar 2024
AI and Memory Wall
A. Gholami
Z. Yao
Sehoon Kim
Coleman Hooper
Michael W. Mahoney
Kurt Keutzer
84
161
0
21 Mar 2024
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms
Toki Tahmid Inan
Mingrui Liu
Amarda Shehu
63
0
0
01 Mar 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
84
3
0
28 Feb 2024
OptEx: Expediting First-Order Optimization with Approximately Parallelized Iterations
Yao Shu
Jiongfeng Fang
Y. He
Fei Richard Yu
66
0
0
18 Feb 2024
Switch EMA: A Free Lunch for Better Flatness and Sharpness
Siyuan Li
Zicheng Liu
Juanxi Tian
Ge Wang
Zedong Wang
...
Cheng Tan
Tao Lin
Yang Liu
Baigui Sun
Stan Z. Li
66
6
0
14 Feb 2024
Scalable Kernel Logistic Regression with Nyström Approximation: Theoretical Analysis and Application to Discrete Choice Modelling
José Ángel Martín-Baos
Ricardo García-Ródenas
Luis Rodriguez-Benitez
Michel Bierlaire
50
1
0
09 Feb 2024
An Inexact Halpern Iteration with Application to Distributionally Robust Optimization
Ling Liang
Zusen Xu
Kim-Chuan Toh
Jia Jie Zhu
152
4
0
08 Feb 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong
Junhong Lin
129
13
0
06 Feb 2024
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
Sobihan Surendran
Antoine Godichon-Baggioni
Adeline Fermanian
Sylvain Le Corff
109
2
0
05 Feb 2024
Emergence of heavy tails in homogenized stochastic gradient descent
Zhe Jiao
Martin Keller-Ressel
61
1
0
02 Feb 2024
Towards Quantum-Safe Federated Learning via Homomorphic Encryption: Learning with Gradients
Guangfeng Yan
Shanxiang Lyu
Hanxu Hou
Zhiyong Zheng
Linqi Song
FedML
20
1
0
02 Feb 2024
Leveraging Nested MLMC for Sequential Neural Posterior Estimation with Intractable Likelihoods
Xiliang Yang
Yifei Xiong
Zhijian He
84
0
0
30 Jan 2024
Convergence Rates for Stochastic Approximation: Biased Noise with Unbounded Variance, and Applications
Rajeeva Laxman Karandikar
M. Vidyasagar
66
10
0
05 Dec 2023
On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations
Antoine Godichon-Baggioni
Nicklas Werge
ODL
118
4
0
29 Nov 2023
SensLI: Sensitivity-Based Layer Insertion for Neural Networks
Evelyn Herberg
Roland A. Herzog
Frederik Köhne
Leonie Kreis
Anton Schiela
50
0
0
27 Nov 2023
Transformer-based Named Entity Recognition in Construction Supply Chain Risk Management in Australia
Milad Baghalzadeh Shishehgarkhaneh
R. Moehler
Yihai Fang
Amer A. Hijazi
Hamed Aboutorab
100
10
0
23 Nov 2023
High Probability Guarantees for Random Reshuffling
Hengxu Yu
Xiao Li
91
2
0
20 Nov 2023
Information-Theoretic Trust Regions for Stochastic Gradient-Based Optimization
Philipp Dahlinger
P. Becker
Maximilian Hüttenrauch
Gerhard Neumann
46
0
0
31 Oct 2023
Optimization of utility-based shortfall risk: A non-asymptotic viewpoint
Sumedh Gupte
Prashanth L. A.
Sanjay P. Bhat
54
1
0
28 Oct 2023
Performative Prediction: Past and Future
Moritz Hardt
Celestine Mendler-Dünner
159
26
0
25 Oct 2023
Graph Neural Networks and Applied Linear Algebra
Nicholas S. Moore
Eric C. Cyr
Peter Ohm
C. Siefert
R. Tuminaro
75
4
0
21 Oct 2023
DYNAMITE: Dynamic Interplay of Mini-Batch Size and Aggregation Frequency for Federated Learning with Static and Streaming Dataset
Weijie Liu
Xiaoxi Zhang
Jingpu Duan
Carlee Joe-Wong
Zhi Zhou
Xu Chen
80
9
0
20 Oct 2023
High Throughput Training of Deep Surrogates from Large Ensemble Runs
Lucas Meyer
M. Schouler
R. Caulk
Alejandro Ribés
Bruno Raffin
AI4CE
57
6
0
28 Sep 2023
A Novel Gradient Methodology with Economical Objective Function Evaluations for Data Science Applications
Christian Varner
Vivak Patel
71
2
0
19 Sep 2023
Derivation of Coordinate Descent Algorithms from Optimal Control Theory
I. Michael Ross
27
1
0
07 Sep 2023
We Don't Need No Adam, All We Need Is EVE: On The Variance of Dual Learning Rate And Beyond
A. Khadangi
ODL
109
0
0
21 Aug 2023
Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent
Xiaoge Deng
Li Shen
Shengwei Li
Tao Sun
Dongsheng Li
Dacheng Tao
85
3
0
18 Aug 2023
Quantile Optimization via Multiple Timescale Local Search for Black-box Functions
Jiaqiao Hu
Meichen Song
Michael Fu
28
6
0
15 Aug 2023