Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models

17 March 2022
Aaron Mueller, Robert Frank, Tal Linzen, Luheng Wang, Sebastian Schuster

Papers citing "Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models"

29 / 29 papers shown

Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Alex Warstadt, Aaron Mueller, Leshem Choshen, E. Wilcox, Chengxu Zhuang, ..., Rafael Mosquera, Bhargavi Paranjape, Adina Williams, Tal Linzen, Ryan Cotterell
10 Apr 2025 · 120 citations

How Does Code Pretraining Affect Language Model Task Performance?
Jackson Petty, Sjoerd van Steenkiste, Tal Linzen
06 Sep 2024 · 12 citations

Transformers Generalize Linearly
Jackson Petty, Robert Frank
24 Sep 2021 · 16 citations

Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models
Matthew Finlayson, Aaron Mueller, Sebastian Gehrmann, Stuart M. Shieber, Tal Linzen, Yonatan Belinkov
10 Jun 2021 · 110 citations

The Low-Dimensional Linear Geometry of Contextualized Word Representations
Evan Hernandez, Jacob Andreas
15 May 2021 · 42 citations

Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction
Shauli Ravfogel, Grusha Prasad, Tal Linzen, Yoav Goldberg
14 May 2021 · 59 citations

mT5: A massively multilingual pre-trained text-to-text transformer
Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel
22 Oct 2020 · 2,533 citations

Can neural networks acquire a structural bias from raw linguistic data?
Alex Warstadt, Samuel R. Bowman
14 Jul 2020 · 54 citations

Finding Universal Grammatical Relations in Multilingual BERT
Ethan A. Chi, John Hewitt, Christopher D. Manning
09 May 2020 · 151 citations

A Systematic Assessment of Syntactic Generalization in Neural Language Models
Jennifer Hu, Jon Gauthier, Peng Qian, Ethan Gotlieb Wilcox, R. Levy
07 May 2020 · 220 citations

How Can We Accelerate Progress Towards Human-like Linguistic Generalization?
Tal Linzen
03 May 2020 · 194 citations

Cross-Linguistic Syntactic Evaluation of Word Prediction Models
Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou, N. Talmina, Tal Linzen
01 May 2020 · 56 citations

Multilingual Denoising Pre-training for Neural Machine Translation
Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, M. Lewis, Luke Zettlemoyer
22 Jan 2020 · 1,806 citations

Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks
R. Thomas McCoy, Robert Frank, Tal Linzen
10 Jan 2020 · 108 citations

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
23 Oct 2019 · 20,053 citations

Quantity doesn't buy quality syntax with neural language models
Marten van Schijndel, Aaron Mueller, Tal Linzen
31 Aug 2019 · 68 citations

RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov
26 Jul 2019 · 24,351 citations

What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning
11 Jun 2019 · 1,592 citations

BERT Rediscovers the Classical NLP Pipeline
Ian Tenney, Dipanjan Das, Ellie Pavlick
15 May 2019 · 1,471 citations

Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages
Shauli Ravfogel, Yoav Goldberg, Tal Linzen
15 Mar 2019 · 70 citations

Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
R. Thomas McCoy, Ellie Pavlick, Tal Linzen
04 Feb 2019 · 1,234 citations

Assessing BERT's Syntactic Abilities
Yoav Goldberg
16 Jan 2019 · 495 citations

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
11 Oct 2018 · 94,511 citations

What do RNN Language Models Learn about Filler-Gap Dependencies?
Ethan Gotlieb Wilcox, R. Levy, Takashi Morita, Richard Futrell
31 Aug 2018 · 168 citations

Targeted Syntactic Evaluation of Language Models
Rebecca Marvin, Tal Linzen
27 Aug 2018 · 415 citations

Revisiting the poverty of the stimulus: hierarchical generalization without a hierarchical bias in recurrent neural networks
R. Thomas McCoy, Robert Frank, Tal Linzen
25 Feb 2018 · 81 citations

Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin
12 Jun 2017 · 130,942 citations

Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies
Tal Linzen, Emmanuel Dupoux, Yoav Goldberg
04 Nov 2016 · 903 citations

Sequence to Sequence Learning with Neural Networks
Ilya Sutskever, Oriol Vinyals, Quoc V. Le
10 Sep 2014 · 20,528 citations