Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models
17 March 2022
Aaron Mueller, Robert Frank, Tal Linzen, Luheng Wang, Sebastian Schuster
Papers citing "Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models" (29 papers)
Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Alex Warstadt, Aaron Mueller, Leshem Choshen, E. Wilcox, Chengxu Zhuang, ..., Rafael Mosquera, Bhargavi Paranjape, Adina Williams, Tal Linzen, Ryan Cotterell
10 Apr 2025
How Does Code Pretraining Affect Language Model Task Performance?
Jackson Petty, Sjoerd van Steenkiste, Tal Linzen
06 Sep 2024
Transformers Generalize Linearly
Jackson Petty, Robert Frank
24 Sep 2021
Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models
Matthew Finlayson, Aaron Mueller, Sebastian Gehrmann, Stuart M. Shieber, Tal Linzen, Yonatan Belinkov
10 Jun 2021
The Low-Dimensional Linear Geometry of Contextualized Word Representations
Evan Hernandez, Jacob Andreas
15 May 2021
Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction
Shauli Ravfogel, Grusha Prasad, Tal Linzen, Yoav Goldberg
14 May 2021
mT5: A massively multilingual pre-trained text-to-text transformer
Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel
22 Oct 2020
Can neural networks acquire a structural bias from raw linguistic data?
Alex Warstadt, Samuel R. Bowman
14 Jul 2020
Finding Universal Grammatical Relations in Multilingual BERT
Ethan A. Chi, John Hewitt, Christopher D. Manning
09 May 2020
A Systematic Assessment of Syntactic Generalization in Neural Language Models
Jennifer Hu, Jon Gauthier, Peng Qian, Ethan Gotlieb Wilcox, R. Levy
07 May 2020
How Can We Accelerate Progress Towards Human-like Linguistic Generalization?
Tal Linzen
03 May 2020
Cross-Linguistic Syntactic Evaluation of Word Prediction Models
Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou, N. Talmina, Tal Linzen
01 May 2020
Multilingual Denoising Pre-training for Neural Machine Translation
Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, M. Lewis, Luke Zettlemoyer
22 Jan 2020
Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks
R. Thomas McCoy, Robert Frank, Tal Linzen
10 Jan 2020
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
23 Oct 2019
Quantity doesn't buy quality syntax with neural language models
Marten van Schijndel, Aaron Mueller, Tal Linzen
31 Aug 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov
26 Jul 2019
What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning
11 Jun 2019
BERT Rediscovers the Classical NLP Pipeline
Ian Tenney, Dipanjan Das, Ellie Pavlick
15 May 2019
Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages
Shauli Ravfogel, Yoav Goldberg, Tal Linzen
15 Mar 2019
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
R. Thomas McCoy, Ellie Pavlick, Tal Linzen
04 Feb 2019
Assessing BERT's Syntactic Abilities
Yoav Goldberg
16 Jan 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
11 Oct 2018
What do RNN Language Models Learn about Filler-Gap Dependencies?
Ethan Gotlieb Wilcox, R. Levy, Takashi Morita, Richard Futrell
31 Aug 2018
Targeted Syntactic Evaluation of Language Models
Rebecca Marvin, Tal Linzen
27 Aug 2018
Revisiting the poverty of the stimulus: hierarchical generalization without a hierarchical bias in recurrent neural networks
R. Thomas McCoy, Robert Frank, Tal Linzen
25 Feb 2018
Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin
12 Jun 2017
Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies
Tal Linzen, Emmanuel Dupoux, Yoav Goldberg
04 Nov 2016
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever, Oriol Vinyals, Quoc V. Le
10 Sep 2014