ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2409.06131
  4. Cited By
Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review
v1v2 (latest)

Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review

10 September 2024
Neha Prakriya
Jui-Nan Yen
Cho-Jui Hsieh
Jason Cong
    KELMAI4CELRM
ArXiv (abs)PDFHTML

Papers citing "Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review"

16 / 16 papers shown
Title
LIFT: LLM-Based Pragma Insertion for HLS via GNN Supervised Fine-Tuning
LIFT: LLM-Based Pragma Insertion for HLS via GNN Supervised Fine-Tuning
Neha Prakriya
Zijian Ding
Yizhou Sun
Jason Cong
64
1
0
29 Apr 2025
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora
  with Web Data, and Web Data Only
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Guilherme Penedo
Quentin Malartic
Daniel Hesslow
Ruxandra-Aimée Cojocaru
Alessandro Cappelli
Hamza Alobeidli
B. Pannier
Ebtesam Almazrouei
Julien Launay
128
775
0
01 Jun 2023
The MiniPile Challenge for Data-Efficient Language Models
The MiniPile Challenge for Data-Efficient Language Models
Jean Kaddour
MoEALM
113
46
0
17 Apr 2023
SemDeDup: Data-efficient learning at web-scale through semantic
  deduplication
SemDeDup: Data-efficient learning at web-scale through semantic deduplication
Amro Abbas
Kushal Tirumala
Daniel Simig
Surya Ganguli
Ari S. Morcos
75
182
0
16 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALMPILM
1.5K
13,437
0
27 Feb 2023
Data Selection for Language Models via Importance Resampling
Data Selection for Language Models via Importance Resampling
Sang Michael Xie
Shibani Santurkar
Tengyu Ma
Percy Liang
118
193
0
06 Feb 2023
Training Trajectories of Language Models Across Scales
Training Trajectories of Language Models Across Scales
Mengzhou Xia
Mikel Artetxe
Chunting Zhou
Xi Lin
Ramakanth Pasunuru
Danqi Chen
Luke Zettlemoyer
Ves Stoyanov
AIFinLRM
85
64
0
19 Dec 2022
Text Embeddings by Weakly-Supervised Contrastive Pre-training
Text Embeddings by Weakly-Supervised Contrastive Pre-training
Liang Wang
Nan Yang
Xiaolong Huang
Binxing Jiao
Linjun Yang
Daxin Jiang
Rangan Majumder
Furu Wei
VLM
246
622
0
07 Dec 2022
OPT: Open Pre-trained Transformer Language Models
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLMOSLMAI4CE
362
3,699
0
02 May 2022
Deep Learning on a Data Diet: Finding Important Examples Early in
  Training
Deep Learning on a Data Diet: Finding Important Examples Early in Training
Mansheej Paul
Surya Ganguli
Gintare Karolina Dziugaite
119
461
0
15 Jul 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
475
2,121
0
31 Dec 2020
Measuring Massive Multitask Language Understanding
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELMRALM
184
4,572
0
07 Sep 2020
Language Models are Few-Shot Learners
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
880
42,379
0
28 May 2020
Unsupervised Neural Machine Translation
Unsupervised Neural Machine Translation
Mikel Artetxe
Gorka Labaka
Eneko Agirre
Kyunghyun Cho
95
772
0
30 Oct 2017
The LAMBADA dataset: Word prediction requiring a broad discourse context
The LAMBADA dataset: Word prediction requiring a broad discourse context
Denis Paperno
Germán Kruszewski
Angeliki Lazaridou
Q. N. Pham
Raffaella Bernardi
Sandro Pezzelle
Marco Baroni
Gemma Boleda
Raquel Fernández
142
727
0
20 Jun 2016
One Billion Word Benchmark for Measuring Progress in Statistical
  Language Modeling
One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
Ciprian Chelba
Tomas Mikolov
M. Schuster
Qi Ge
T. Brants
P. Koehn
T. Robinson
190
1,109
0
11 Dec 2013
1