ResearchTrend.AI

Do Generative Large Language Models need billions of parameters?
arXiv:2309.06589 · 12 September 2023
Sia Gholami, Marwan Omar

Papers citing "Do Generative Large Language Models need billions of parameters?"

9 papers shown.

1. Investigating Recent Large Language Models for Vietnamese Machine Reading Comprehension
   Anh Duc Nguyen, Hieu Minh Phi, Anh Viet Ngo, Long Hai Trieu, Thai Nguyen
   23 Mar 2025

2. Exploring the landscape of large language models: Foundations, techniques, and challenges
   M. Moradi, Ke Yan, David Colwell, Matthias Samwald, Rhona Asgari
   18 Apr 2024

3. Does Synthetic Data Make Large Language Models More Efficient?
   Sia Gholami, Marwan Omar
   11 Oct 2023

4. Can pruning make Large Language Models more efficient?
   Sia Gholami, Marwan Omar
   06 Oct 2023

5. Can a student Large Language Model perform as well as it's teacher?
   Sia Gholami, Marwan Omar
   03 Oct 2023

6. Big Bird: Transformers for Longer Sequences
   Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
   28 Jul 2020

7. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
   M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
   17 Sep 2019

8. Text Summarization with Pretrained Encoders
   Yang Liu, Mirella Lapata
   22 Aug 2019

9. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
   N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
   15 Sep 2016