On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations
Laura Pérez-Mayos, Roberto Carlini, Miguel Ballesteros, Leo Wanner
arXiv:2101.11492, 27 January 2021

Papers citing "On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations" (36 papers)

Attention Can Reflect Syntactic Structure (If You Let It). Vinit Ravishankar, Artur Kulmizev, Mostafa Abdou, Anders Søgaard, Joakim Nivre. 26 Jan 2021.
Analyzing Individual Neurons in Pre-trained Language Models. Nadir Durrani, Hassan Sajjad, Fahim Dalvi, Yonatan Belinkov. 06 Oct 2020.
Do Syntax Trees Help Pre-trained Transformers Extract Information? Devendra Singh Sachan, Yuhao Zhang, Peng Qi, William L. Hamilton. 20 Aug 2020.
Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals. Yanai Elazar, Shauli Ravfogel, Alon Jacovi, Yoav Goldberg. 01 Jun 2020.
Syntactic Structure Distillation Pretraining For Bidirectional Encoders. A. Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom. 27 May 2020.
The Unstoppable Rise of Computational Linguistics in Deep Learning. James Henderson. 13 May 2020.
Finding Universal Grammatical Relations in Multilingual BERT. Ethan A. Chi, John Hewitt, Christopher D. Manning. 09 May 2020.
Universal Dependencies according to BERT: both more specific and more general. Tomasz Limisiewicz, Rudolf Rosa, David Mareček. 30 Apr 2020.
What Happens To BERT Embeddings During Fine-tuning? Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney. 29 Apr 2020.
Do Neural Language Models Show Preferences for Syntactic Formalisms? Artur Kulmizev, Vinit Ravishankar, Mostafa Abdou, Joakim Nivre. 29 Apr 2020.
Syntactic Structure from Deep Learning. Tal Linzen, Marco Baroni. 22 Apr 2020.
Information-Theoretic Probing with Minimum Description Length. Elena Voita, Ivan Titov. 27 Mar 2020.
A Primer in BERTology: What we know about how BERT works. Anna Rogers, Olga Kovaleva, Anna Rumshisky. 27 Feb 2020.
Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping. Jesse Dodge, Gabriel Ilharco, Roy Schwartz, Ali Farhadi, Hannaneh Hajishirzi, Noah A. Smith. 15 Feb 2020.
Parsing as Pretraining. David Vilares, Michalina Strzyz, Anders Søgaard, Carlos Gómez-Rodríguez. 05 Feb 2020.
Linking artificial and human neural representations of language. Jiajun Liu, Roger Levy. 02 Oct 2019.
Revealing the Dark Secrets of BERT. Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky. 21 Aug 2019.
What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models. Allyson Ettinger. 31 Jul 2019.
Open Sesame: Getting Inside BERT's Linguistic Knowledge. Yongjie Lin, Y. Tan, Robert Frank. 04 Jun 2019.
How to best use Syntax in Semantic Role Labelling. Yufei Wang, Mark Johnson, Stephen Wan, Yifang Sun, Wei Wang. 01 Jun 2019.
What do you learn from context? Probing for sentence structure in contextualized word representations. Ian Tenney, Patrick Xia, Berlin Chen, Alex Jinpeng Wang, Adam Poliak, ..., Najoung Kim, Benjamin Van Durme, Samuel R. Bowman, Dipanjan Das, Ellie Pavlick. 15 May 2019.
BERT Rediscovers the Classical NLP Pipeline. Ian Tenney, Dipanjan Das, Ellie Pavlick. 15 May 2019.
Simple BERT Models for Relation Extraction and Semantic Role Labeling. Peng Shi, Jimmy J. Lin. 10 Apr 2019.
Linguistic Knowledge and Transferability of Contextual Representations. Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith. 21 Mar 2019.
To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks. Matthew E. Peters, Sebastian Ruder, Noah A. Smith. 14 Mar 2019.
Assessing BERT's Syntactic Abilities. Yoav Goldberg. 16 Jan 2019.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. 11 Oct 2018.
Know What You Don't Know: Unanswerable Questions for SQuAD. Pranav Rajpurkar, Robin Jia, Percy Liang. 11 Jun 2018.
What you can cram into a single vector: Probing sentence embeddings for linguistic properties. Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni. 03 May 2018.
Linguistically-Informed Self-Attention for Semantic Role Labeling. Emma Strubell, Pat Verga, D. Andor, David J. Weiss, Andrew McCallum. 23 Apr 2018.
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman. 20 Apr 2018.
AllenNLP: A Deep Semantic Natural Language Processing Platform. Matt Gardner, Joel Grus, Mark Neumann, Oyvind Tafjord, Pradeep Dasigi, Nelson F. Liu, Matthew E. Peters, Michael Schmitz, Luke Zettlemoyer. 20 Mar 2018.
Deep contextualized word representations. Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer. 15 Feb 2018.
What do Neural Machine Translation Models Learn about Morphology? Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James R. Glass. 11 Apr 2017.
Deep Biaffine Attention for Neural Dependency Parsing. Timothy Dozat, Christopher D. Manning. 06 Nov 2016.
Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks. Yossi Adi, Einat Kermany, Yonatan Belinkov, Ofer Lavi, Yoav Goldberg. 15 Aug 2016.