2301.02312
Cited By
Training trajectories, mini-batch losses and the curious role of the learning rate
5 January 2023
Mark Sandler
A. Zhmoginov
Max Vladymyrov
Nolan Miller
ODL
Papers citing "Training trajectories, mini-batch losses and the curious role of the learning rate"
3 / 3 papers shown
Title
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
20
41
0
12 Jul 2023
Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth
J. Hayase
S. Srinivasa
MoMe
255
314
0
11 Sep 2022
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
296
39,198
0
01 Sep 2014