1606.04838
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing "Optimization Methods for Large-Scale Machine Learning"
50 / 1,406 papers shown
Title
DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series
Zahra Zamanzadeh Darban
Yiyuan Yang
Geoffrey I. Webb
Charu C. Aggarwal
Qingsong Wen
Shirui Pan
Mahsa Salehi
60
0
0
17 Apr 2024
I/O in Machine Learning Applications on HPC Systems: A 360-degree Survey
Noah Lewis
J. L. Bez
Suren Byna
57
0
0
16 Apr 2024
Minimizing Chebyshev Prototype Risk Magically Mitigates the Perils of Overfitting
Nathaniel R. Dean
Dilip Sarkar
AAML
28
0
0
10 Apr 2024
Unifying Low Dimensional Observations in Deep Learning Through the Deep Linear Unconstrained Feature Model
Connall Garrod
Jonathan P. Keating
41
8
0
09 Apr 2024
Stochastic Online Optimization for Cyber-Physical and Robotic Systems
Hao Ma
M. Zeilinger
Michael Muehlebach
50
0
0
08 Apr 2024
A Structure-Guided Gauss-Newton Method for Shallow ReLU Neural Network
Zhiqiang Cai
Tong Ding
Min Liu
Xinyu Liu
Jianlin Xia
173
2
0
07 Apr 2024
Optimal Batch Allocation for Wireless Federated Learning
Jaeyoung Song
Sang-Woon Jeon
31
0
0
03 Apr 2024
Satellite Federated Edge Learning: Architecture Design and Convergence Analysis
Yuanming Shi
Li Zeng
Jingyang Zhu
Yong Zhou
Chunxiao Jiang
Khaled B. Letaief
33
12
0
02 Apr 2024
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
Xingwu Chen
Difan Zou
ViT
26
12
0
02 Apr 2024
DRIVE: Dual Gradient-Based Rapid Iterative Pruning
Dhananjay Saikumar
Blesson Varghese
30
0
0
01 Apr 2024
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang
Yi Zhou
Shaofeng Zou
42
4
0
01 Apr 2024
Communication Efficient Distributed Training with Distributed Lion
Bo Liu
Lemeng Wu
Lizhang Chen
Kaizhao Liang
Jiaxu Zhu
Chen Liang
Raghuraman Krishnamoorthi
Qiang Liu
32
6
0
30 Mar 2024
HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks
Yongyi Yang
Jiaming Yang
Wei Hu
Michal Dereziński
48
0
0
26 Mar 2024
AI and Memory Wall
A. Gholami
Z. Yao
Sehoon Kim
Coleman Hooper
Michael W. Mahoney
Kurt Keutzer
27
143
0
21 Mar 2024
PETScML: Second-order solvers for training regression problems in Scientific Machine Learning
Stefano Zampini
Umberto Zerbinati
George Turkiyyah
David E. Keyes
43
4
0
18 Mar 2024
Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates
Riccardo Grazzi
Massimiliano Pontil
Saverio Salzo
49
1
0
18 Mar 2024
A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques
Xuetong Li
Yuan Gao
Hong Chang
Danyang Huang
Yingying Ma
...
Ke Xu
Jing Zhou
Xuening Zhu
Yingqiu Zhu
Hansheng Wang
44
7
0
17 Mar 2024
Streamlining in the Riemannian Realm: Efficient Riemannian Optimization with Loopless Variance Reduction
Yury Demidovich
Grigory Malinovsky
Peter Richtárik
56
2
0
11 Mar 2024
Shuffling Momentum Gradient Algorithm for Convex Optimization
Trang H. Tran
Quoc Tran-Dinh
Lam M. Nguyen
23
1
0
05 Mar 2024
SOFIM: Stochastic Optimization Using Regularized Fisher Information Matrix
Mrinmay Sen
A. K. Qin
Gayathri C
Raghu Kishore N
Yen-Wei Chen
Balasubramanian Raman
39
1
0
05 Mar 2024
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms
Toki Tahmid Inan
Mingrui Liu
Amarda Shehu
32
0
0
01 Mar 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
22
3
0
28 Feb 2024
Gradient-based Discrete Sampling with Automatic Cyclical Scheduling
Patrick Pynadath
Riddhiman Bhattacharya
Arun Hariharan
Ruqi Zhang
33
3
0
27 Feb 2024
Efficient Backpropagation with Variance-Controlled Adaptive Sampling
Ziteng Wang
Jianfei Chen
Jun Zhu
BDL
40
2
0
27 Feb 2024
On the connection between Noise-Contrastive Estimation and Contrastive Divergence
Amanda Olmin
Jakob Lindqvist
Lennart Svensson
Fredrik Lindsten
35
0
0
26 Feb 2024
NeuroFlux: Memory-Efficient CNN Training Using Adaptive Local Learning
Dhananjay Saikumar
Blesson Varghese
24
1
0
21 Feb 2024
Revisiting Convergence of AdaGrad with Relaxed Assumptions
Yusu Hong
Junhong Lin
28
12
0
21 Feb 2024
SGD with Clipping is Secretly Estimating the Median Gradient
Fabian Schaipp
Guillaume Garrigos
Umut Simsekli
Robert M. Gower
19
0
0
20 Feb 2024
Byzantine-Robust Federated Learning: Impact of Client Subsampling and Local Updates
Youssef Allouah
Sadegh Farhadkhani
R. Guerraoui
Nirupam Gupta
Rafael Pinot
Geovani Rizk
S. Voitovych
FedML
30
4
0
20 Feb 2024
OptEx: Expediting First-Order Optimization with Approximately Parallelized Iterations
Yao Shu
Jiongfeng Fang
Y. He
Fei Richard Yu
27
0
0
18 Feb 2024
AdAdaGrad: Adaptive Batch Size Schemes for Adaptive Gradient Methods
Tim Tsz-Kit Lau
Han Liu
Mladen Kolar
ODL
32
6
0
17 Feb 2024
An Accelerated Distributed Stochastic Gradient Method with Momentum
Kun-Yen Huang
Shi Pu
Angelia Nedić
35
8
0
15 Feb 2024
Switch EMA: A Free Lunch for Better Flatness and Sharpness
Siyuan Li
Zicheng Liu
Juanxi Tian
Ge Wang
Zedong Wang
...
Cheng Tan
Tao Lin
Yang Liu
Baigui Sun
Stan Z. Li
30
6
0
14 Feb 2024
Corridor Geometry in Gradient-Based Optimization
Benoit Dherin
M. Rosca
34
0
0
13 Feb 2024
Preconditioners for the Stochastic Training of Implicit Neural Representations
Shin-Fang Chng
Hemanth Saratchandran
Simon Lucey
26
0
0
13 Feb 2024
Tuning-Free Stochastic Optimization
Ahmed Khaled
Chi Jin
32
7
0
12 Feb 2024
Accelerating Distributed Deep Learning using Lossless Homomorphic Compression
Haoyu Li
Yuchen Xu
Jiayi Chen
Rohit Dwivedula
Wenfei Wu
Keqiang He
Aditya Akella
Daehyeok Kim
FedML
AI4CE
23
4
0
12 Feb 2024
Scalable Kernel Logistic Regression with Nyström Approximation: Theoretical Analysis and Application to Discrete Choice Modelling
José Ángel Martín-Baos
Ricardo García-Ródenas
Luis Rodriguez-Benitez
Michel Bierlaire
23
1
0
09 Feb 2024
Feed-Forward Neural Networks as a Mixed-Integer Program
Navid Aftabi
Nima Moradi
Fatemeh Mahroo
24
2
0
09 Feb 2024
An Inexact Halpern Iteration with Application to Distributionally Robust Optimization
Ling Liang
Kim-Chuan Toh
Jia Jie Zhu
32
4
0
08 Feb 2024
On the Convergence of Zeroth-Order Federated Tuning for Large Language Models
Zhenqing Ling
Daoyuan Chen
Liuyi Yao
Yaliang Li
Ying Shen
FedML
47
12
0
08 Feb 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong
Junhong Lin
46
11
0
06 Feb 2024
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
Wu Lin
Felix Dangel
Runa Eschenhagen
Juhan Bae
Richard Turner
Alireza Makhzani
ODL
54
12
0
05 Feb 2024
Ginger: An Efficient Curvature Approximation with Linear Complexity for General Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
ODL
38
1
0
05 Feb 2024
Frugal Actor-Critic: Sample Efficient Off-Policy Deep Reinforcement Learning Using Unique Experiences
Nikhil Kumar Singh
Indranil Saha
OffRL
14
0
0
05 Feb 2024
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
Sobihan Surendran
Antoine Godichon-Baggioni
Adeline Fermanian
Sylvain Le Corff
45
1
0
05 Feb 2024
Incremental Quasi-Newton Methods with Faster Superlinear Convergence Rates
Zhuanghua Liu
Luo Luo
K. H. Low
16
2
0
04 Feb 2024
Emergence of heavy tails in homogenized stochastic gradient descent
Zhe Jiao
Martin Keller-Ressel
16
1
0
02 Feb 2024
Improved Quantization Strategies for Managing Heavy-tailed Gradients in Distributed Learning
Guangfeng Yan
Tan Li
Yuanzhang Xiao
Hanxu Hou
Linqi Song
MQ
29
0
0
02 Feb 2024
Truncated Non-Uniform Quantization for Distributed SGD
Guangfeng Yan
Tan Li
Yuanzhang Xiao
Congduan Li
Linqi Song
MQ
12
0
0
02 Feb 2024