2403.14398
Cited By
Regularized Adaptive Momentum Dual Averaging with an Efficient Inexact Subproblem Solver for Training Structured Neural Network
21 March 2024
Zih-Syuan Huang
Ching-pei Lee
Papers citing "Regularized Adaptive Momentum Dual Averaging with an Efficient Inexact Subproblem Solver for Training Structured Neural Network"
5 / 5 papers shown
Title
Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, but Sign Descent Might Be
Frederik Kunstner, Jacques Chen, J. Lavington, Mark W. Schmidt
40 / 67 / 0
27 Apr 2023

Training Structured Neural Networks Through Manifold Identification and Variance Reduction
Zih-Syuan Huang, Ching-pei Lee
AAML
48 / 9 / 0
05 Dec 2021

Dual Averaging is Surprisingly Effective for Deep Learning Optimization
Samy Jelassi, Aaron Defazio
28 / 4 / 0
20 Oct 2020

A Simple Convergence Proof of Adam and Adagrad
Alexandre Défossez, Léon Bottou, Francis R. Bach, Nicolas Usunier
56 / 143 / 0
05 Mar 2020

ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, J. Krause, S. Satheesh, ..., A. Karpathy, A. Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei
VLM, ObjD
296 / 39,198 / 0
01 Sep 2014