Optimal Embedding Learning Rate in LLMs: The Effect of Vocabulary Size

Soufiane Hayou
Liyuan Liu
