2412.11768
Cited By
v1
v2 (latest)
No More Adam: Learning Rate Scaling at Initialization is All You Need
16 December 2024
Minghao Xu
Lichuan Xiang
Xu Cai
Hongkai Wen
Papers citing
"No More Adam: Learning Rate Scaling at Initialization is All You Need"
2 / 2 papers shown
Title
A Minimalist Optimizer Design for LLM Pretraining
Athanasios Glentis
Jiaxiang Li
Andi Han
Mingyi Hong
24
0
0
20 Jun 2025
Gradient Multi-Normalization for Stateless and Scalable LLM Training
M. Scetbon
Chao Ma
Wenbo Gong
Edward Meeds
202
1
0
10 Feb 2025