Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws

31 December 2023

Papers citing "Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws"

8 / 58 papers shown

Title
Lifelong Benchmarks: Efficient Model Evaluation in an Era of Rapid Progress Ameya Prabhu Vishaal Udandarao Philip Torr Matthias Bethge Adel Bibi Samuel Albanie 42 5 0 29 Feb 2024
Scaling Laws for Fine-Grained Mixture of Experts Jakub Krajewski Jan Ludziejewski Kamil Adamczewski Maciej Pióro Michal Krutul ... Krystian Król Tomasz Odrzygóźdź Piotr Sankowski Marek Cygan Sebastian Jaszczur MoE 51 54 0 12 Feb 2024
A Dynamical Model of Neural Scaling Laws Blake Bordelon Alexander B. Atanasov Cengiz Pehlevan 51 36 0 02 Feb 2024
CroissantLLM: A Truly Bilingual French-English Language Model Manuel Faysse Patrick Fernandes Nuno M. Guerreiro António Loison Duarte M. Alves ... François Yvon André F.T. Martins Gautier Viaud Céline Hudelot Pierre Colombo 55 32 0 01 Feb 2024
Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed? Tannon Kew Florian Schottmann Rico Sennrich LRM 28 36 0 20 Dec 2023
Will we run out of data? Limits of LLM scaling based on human-generated data Pablo Villalobos A. Ho J. Sevilla T. Besiroglu Lennart Heim Marius Hobbhahn ALM 38 111 0 26 Oct 2022
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation Ofir Press Noah A. Smith M. Lewis 253 698 0 27 Aug 2021
Scaling Laws for Neural Language Models Jared Kaplan Sam McCandlish T. Henighan Tom B. Brown B. Chess R. Child Scott Gray Alec Radford Jeff Wu Dario Amodei 264 4,489 0 23 Jan 2020