ResearchTrend.AI

Do Generative Large Language Models need billions of parameters?
arXiv:2309.06589 · 12 September 2023
Sia Gholami, Marwan Omar

Papers citing "Do Generative Large Language Models need billions of parameters?"

9 papers shown.

1. Investigating Recent Large Language Models for Vietnamese Machine Reading Comprehension
   Anh Duc Nguyen, Hieu Minh Phi, Anh Viet Ngo, Long Hai Trieu, Thai Nguyen
   23 Mar 2025

2. Exploring the landscape of large language models: Foundations, techniques, and challenges
   M. Moradi, Ke Yan, David Colwell, Matthias Samwald, Rhona Asgari
   18 Apr 2024

3. Does Synthetic Data Make Large Language Models More Efficient?
   Sia Gholami, Marwan Omar
   11 Oct 2023

4. Can pruning make Large Language Models more efficient?
   Sia Gholami, Marwan Omar
   06 Oct 2023

5. Can a student Large Language Model perform as well as it's teacher?
   Sia Gholami, Marwan Omar
   03 Oct 2023

6. Big Bird: Transformers for Longer Sequences
   Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
   28 Jul 2020

7. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
   M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
   17 Sep 2019

8. Text Summarization with Pretrained Encoders
   Yang Liu, Mirella Lapata
   22 Aug 2019

9. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
   N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
   15 Sep 2016