ResearchTrend.AI
arXiv: 2108.02170
Curriculum learning for language modeling

4 August 2021
Daniel Fernando Campos

Papers citing "Curriculum learning for language modeling"

28 / 28 papers shown
Title
Can an Easy-to-Hard Curriculum Make Reasoning Emerge in Small Language Models? Evidence from a Four-Stage Curriculum on GPT-2
Xiang Fu
ReLM
LRM
12
0
0
16 May 2025
Curriculum Learning-Driven PIELMs for Fluid Flow Simulations
Vikas Dwivedi
Bruno Sixou
Monica Sigovan
PINN
AI4CE
44
0
0
08 Mar 2025
Scaling LLM Pre-training with Vocabulary Curriculum
Fangyuan Yu
75
2
0
25 Feb 2025
DeepRTL: Bridging Verilog Understanding and Generation with a Unified Representation Model
Yi Liu
Changran Xu
Yunhao Zhou
Z. Li
Qiang Xu
VLM
51
4
0
20 Feb 2025
FRAMES: Boosting LLMs with A Four-Quadrant Multi-Stage Pretraining Strategy
Xuemiao Zhang
Feiyu Duan
Liangyu Xu
Yongwei Zhou
Sirui Wang
Rongxiang Weng
Jiadong Wang
Xunliang Cai
65
0
0
08 Feb 2025
Code-Switching Curriculum Learning for Multilingual Transfer in LLMs
Haneul Yoo
Cheonbok Park
Sangdoo Yun
Alice H. Oh
Hwaran Lee
34
3
0
04 Nov 2024
Exploring Curriculum Learning for Vision-Language Tasks: A Study on Small-Scale Multimodal Training
Rohan Saha
Abrar Fahim
Alona Fyshe
Alex Murphy
26
0
0
20 Oct 2024
Can Vision Language Models Learn from Visual Demonstrations of Ambiguous Spatial Reasoning?
Bowen Zhao
Leo Parker Dirac
Paulina Varshavskaya
VLM
LRM
28
0
0
25 Sep 2024
How transformers learn structured data: insights from hierarchical filtering
Jerome Garnier-Brun
Marc Mézard
Emanuele Moscato
Luca Saglietti
37
5
0
27 Aug 2024
Curriculum Learning for Small Code Language Models
Marwa Nair
K. Yamani
Lynda Said Lhadj
Riyadh Baghdadi
32
4
0
14 Jul 2024
Interpretability of Language Models via Task Spaces
Lucas Weber
Jaap Jumelet
Elia Bruni
Dieuwke Hupkes
37
4
0
10 Jun 2024
Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation
JoonHo Lee
Jae Oh Woo
Juree Seok
Parisa Hassanzadeh
Wooseok Jang
...
Hankyu Moon
Wenjun Hu
Yeong-Dae Kwon
Taehee Lee
Seungjai Min
47
2
0
10 May 2024
CLIMB: Curriculum Learning for Infant-inspired Model Building
Richard Diehl Martinez
Zébulon Goriely
Hope McGovern
Christopher Davis
Andrew Caines
P. Buttery
Lisa Beinborn
32
10
0
15 Nov 2023
On the effect of curriculum learning with developmental data for grammar acquisition
Mattia Opper
J. Morrison
N. Siddharth
23
2
0
31 Oct 2023
Curriculum Learning with Adam: The Devil Is in the Wrong Details
Lucas Weber
Jaap Jumelet
Paul Michel
Elia Bruni
Dieuwke Hupkes
ODL
18
3
0
23 Aug 2023
How to Plant Trees in Language Models: Data and Architectural Effects on the Emergence of Syntactic Inductive Biases
Aaron Mueller
Tal Linzen
AI4CE
29
20
0
31 May 2023
A Mathematical Model for Curriculum Learning for Parities
Elisabetta Cornacchia
Elchanan Mossel
40
10
0
31 Jan 2023
Efficient Pre-training of Masked Language Model via Concept-based Curriculum Masking
Mingyu Lee
Jun-Hyung Park
Junho Kim
Kang-Min Kim
SangKeun Lee
15
12
0
15 Dec 2022
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing
Conglong Li
Z. Yao
Xiaoxia Wu
Minjia Zhang
Connor Holmes
Cheng Li
Yuxiong He
27
24
0
07 Dec 2022
Will we run out of data? Limits of LLM scaling based on human-generated data
Pablo Villalobos
A. Ho
J. Sevilla
T. Besiroglu
Lennart Heim
Marius Hobbhahn
ALM
33
111
0
26 Oct 2022
Prompt Injection: Parameterization of Fixed Inputs
Eunbi Choi
Yongrae Jo
Joel Jang
Minjoon Seo
16
29
0
31 May 2022
Data Selection Curriculum for Neural Machine Translation
Tasnim Mohiuddin
Philipp Koehn
Vishrav Chaudhary
James Cross
Shruti Bhosale
Shafiq R. Joty
32
11
0
25 Mar 2022
Development and Comparison of Scoring Functions in Curriculum Learning
Himmet Toprak Kesgin
M. Fatih Amasyali
6
3
0
10 Feb 2022
The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models
Conglong Li
Minjia Zhang
Yuxiong He
15
37
0
13 Aug 2021
Curriculum Learning: A Survey
Petru Soviany
Radu Tudor Ionescu
Paolo Rota
N. Sebe
ODL
79
342
0
25 Jan 2021
Optimal Subarchitecture Extraction For BERT
Adrian de Wynter
Daniel J. Perry
MQ
43
18
0
20 Oct 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
261
4,489
0
23 Jan 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018