No More Adam: Learning Rate Scaling at Initialization is All You Need
