2505.23737
On the Convergence Analysis of Muon
29 May 2025
Wei Shen
Ruichuan Huang
Minhui Huang
Cong Shen
Jiawei Zhang
Papers citing
"On the Convergence Analysis of Muon"
6 / 6 papers shown
Title
Towards Quantifying the Hessian Structure of Neural Networks
Zhaorui Dong
Yushun Zhang
Zhi-Quan Luo
Jianfeng Yao
Ruoyu Sun
77
1
0
05 May 2025
ASGO: Adaptive Structured Gradient Optimization
Kang An
Yuxing Liu
Boyao Wang
Shiqian Ma
Tong Zhang
ODL
155
5
0
26 Mar 2025
Understanding Gradient Orthogonalization for Deep Learning via Non-Euclidean Trust-Region Optimization
Dmitry Kovalev
141
5
0
16 Mar 2025
Structured Preconditioners in Adaptive Optimization: A Unified Analysis
Shuo Xie
Tianhao Wang
Sashank J. Reddi
Sanjiv Kumar
Zhiyuan Li
84
4
0
13 Mar 2025
COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs
Liming Liu
Zhenghao Xu
Zixuan Zhang
Hao Kang
Zichong Li
Chen Liang
Weizhu Chen
T. Zhao
411
3
0
24 Feb 2025
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas
Depen Morwani
Rosie Zhao
Itai Shapira
David Brandfonbrener
Lucas Janson
Sham Kakade
167
38
0
17 Sep 2024