Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora

10 April 2025
Alex Warstadt, Aaron Mueller, Leshem Choshen, E. Wilcox, Chengxu Zhuang, Juan Ciro, Rafael Mosquera, Bhargavi Paranjape, Adina Williams, Tal Linzen, Ryan Cotterell

Papers citing "Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora"

Showing 23 of 73 citing papers.
Can training neural language models on a curriculum with developmentally plausible data improve alignment with human reading behavior?
Aryaman Chobey, Oliver Smith, Anzi Wang, Grusha Prasad
30 Nov 2023

Pre-training LLMs using human-like development data corpus
Khushi Bhardwaj, Raj Sanjay Shah, Sashank Varma
08 Nov 2023

Large GPT-like Models are Bad Babies: A Closer Look at the Relationship between Linguistic Competence and Psycholinguistic Measures
Julius Steuer, Marius Mosbach, Dietrich Klakow
08 Nov 2023

Mini Minds: Exploring Bebeshka and Zlata Baby Models
Irina Proskurina, Guillaume Metzler, Julien Velcin
06 Nov 2023 · ALM

Not all layers are equally as important: Every Layer Counts BERT
Lucas Georges Gabriel Charpentier, David Samuel
03 Nov 2023

Too Much Information: Keeping Training Simple for BabyLMs
Lukas Edman, Lisa Bylinina
03 Nov 2023

On the effect of curriculum learning with developmental data for grammar acquisition
Mattia Opper, J. Morrison, N. Siddharth
31 Oct 2023

Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building
Omar Momen, David Arps, Laura Kallmeyer
31 Oct 2023 · AI4CE

Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways
Venkata S Govindarajan, Juan Diego Rodriguez, Kaj Bostrom, Kyle Mahowald
26 Oct 2023

BabyStories: Can Reinforcement Learning Teach Baby Language Models to Write Better Stories?
Xingmeng Zhao, Tongnian Wang, Sheri Osborn, Anthony Rios
25 Oct 2023

ChapGTP, ILLC's Attempt at Raising a BabyLM: Improving Data Efficiency by Automatic Task Formation
Jaap Jumelet, Michael Hanna, Marianne de Heer Kloots, Anna Langedijk, Charlotte Pouw, Oskar van der Wal
17 Oct 2023

Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns
Brian DuSell, David Chiang
03 Oct 2023

ToddlerBERTa: Exploiting BabyBERTa for Grammar Learning and Language Understanding
Omer Veysel Cagatan
30 Aug 2023

Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty
I. Timiryasov, J. Tastet
03 Aug 2023

Baby's CoThought: Leveraging Large Language Models for Enhanced Reasoning in Compact Models
Zheyu Zhang, Han Yang, Bolei Ma, David Rügamer, Ercong Nie
03 Aug 2023 · LRM

SciMON: Scientific Inspiration Machines Optimized for Novelty
Qingyun Wang, Doug Downey, Heng Ji, Tom Hope
23 May 2023 · LLMAG

Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
Oana Ignat, Zhijing Jin, Artem Abzaliev, Laura Biester, Santiago Castro, ..., Verónica Pérez-Rosas, Siqi Shen, Zekun Wang, Winston Wu, Rada Mihalcea
21 May 2023 · LRM

Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-based Hate
Hannah Rose Kirk, B. Vidgen, Paul Röttger, Tristan Thrush, Scott A. Hale
12 Aug 2021

Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press, Noah A. Smith, M. Lewis
31 Dec 2020

DynaSent: A Dynamic Benchmark for Sentiment Analysis
Christopher Potts, Zhengxuan Wu, Atticus Geiger, Douwe Kiela
30 Dec 2020

How Can We Accelerate Progress Towards Human-like Linguistic Generalization?
Tal Linzen
03 May 2020

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
20 Apr 2018 · ELM