arXiv:2408.13359
Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler
23 August 2024
Yikang Shen
Matthew Stallone
Mayank Mishra
Gaoyuan Zhang
Shawn Tan
Aditya Prasad
Adriana Meza Soria
David D. Cox
Rameswar Panda
Papers citing "Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler" (4 of 4 shown)
xGen-small Technical Report
Erik Nijkamp
Bo Pang
Egor Pakhomov
Akash Gokul
Jin Qu
Silvio Savarese
Yingbo Zhou
Caiming Xiong
LLMAG
58
0
0
10 May 2025
EuroBERT: Scaling Multilingual Encoders for European Languages
Nicolas Boizard
Hippolyte Gisserot-Boukhlef
Duarte M. Alves
André F. T. Martins
Ayoub Hammal
...
Maxime Peyrard
Nuno M. Guerreiro
Patrick Fernandes
Ricardo Rei
Pierre Colombo
125
1
0
07 Mar 2025
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Oleg Filatov
Jan Ebert
Jiangtao Wang
Stefan Kesselheim
36
3
0
10 Jan 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
124
2
0
03 Jan 2025