2203.09397
Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models
17 March 2022
Aaron Mueller
Robert Frank
Tal Linzen
Luheng Wang
Sebastian Schuster
AIMat
Papers citing
"Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models"
11 / 11 papers shown
Title
Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Alex Warstadt
Aaron Mueller
Leshem Choshen
E. Wilcox
Chengxu Zhuang
...
Rafael Mosquera
Bhargavi Paranjape
Adina Williams
Tal Linzen
Ryan Cotterell
146
120
0
10 Apr 2025
How Does Code Pretraining Affect Language Model Task Performance?
Jackson Petty
Sjoerd van Steenkiste
Tal Linzen
91
12
0
06 Sep 2024
Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction
Shauli Ravfogel
Grusha Prasad
Tal Linzen
Yoav Goldberg
53
59
0
14 May 2021
Can neural networks acquire a structural bias from raw linguistic data?
Alex Warstadt
Samuel R. Bowman
AI4CE
46
54
0
14 Jul 2020
Finding Universal Grammatical Relations in Multilingual BERT
Ethan A. Chi
John Hewitt
Christopher D. Manning
38
151
0
09 May 2020
A Systematic Assessment of Syntactic Generalization in Neural Language Models
Jennifer Hu
Jon Gauthier
Peng Qian
Ethan Gotlieb Wilcox
R. Levy
ELM
69
220
0
07 May 2020
What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark
Urvashi Khandelwal
Omer Levy
Christopher D. Manning
MILM
209
1,592
0
11 Jun 2019
BERT Rediscovers the Classical NLP Pipeline
Ian Tenney
Dipanjan Das
Ellie Pavlick
MILM
SSeg
126
1,469
0
15 May 2019
Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages
Shauli Ravfogel
Yoav Goldberg
Tal Linzen
63
70
0
15 Mar 2019
Targeted Syntactic Evaluation of Language Models
Rebecca Marvin
Tal Linzen
70
415
0
27 Aug 2018
Revisiting the poverty of the stimulus: hierarchical generalization without a hierarchical bias in recurrent neural networks
R. Thomas McCoy
Robert Frank
Tal Linzen
76
81
0
25 Feb 2018