ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

BERT Rediscovers the Classical NLP Pipeline
arXiv:1905.05950, 15 May 2019
Ian Tenney, Dipanjan Das, Ellie Pavlick

Papers citing "BERT Rediscovers the Classical NLP Pipeline"

50 of 821 citing papers shown (title, authors, date):
It's All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning
Alexey Tikhonov, Max Ryabinin (22 Jun 2021)

Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
Colin Wei, Sang Michael Xie, Tengyu Ma (17 Jun 2021)

Pre-Trained Models: Past, Present and Future
Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, ..., Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu (14 Jun 2021)

Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models
Matthew Finlayson, Aaron Mueller, Sebastian Gehrmann, Stuart M. Shieber, Tal Linzen, Yonatan Belinkov (10 Jun 2021)

Causal Abstractions of Neural Networks
Atticus Geiger, Hanson Lu, Thomas Icard, Christopher Potts (06 Jun 2021)

BERTnesia: Investigating the capture and forgetting of knowledge in BERT
Jonas Wallat, Jaspreet Singh, Avishek Anand (05 Jun 2021)

CLIP: A Dataset for Extracting Action Items for Physicians from Hospital Discharge Notes
J. Mullenbach, Yada Pruksachatkun, Sean Adler, Jennifer Seale, Jordan Swartz, T. McKelvey, Hui Dai, Yi Yang, David Sontag (04 Jun 2021)

Enriching Transformers with Structured Tensor-Product Representations for Abstractive Summarization
Yichen Jiang, Asli Celikyilmaz, P. Smolensky, Paul Soulos, Sudha Rao, Hamid Palangi, Roland Fernandez, Caitlin Smith, Joey Tianyi Zhou, Jianfeng Gao (02 Jun 2021)

John praised Mary because he? Implicit Causality Bias and Its Interaction with Explicit Cues in LMs
Yova Kementchedjhieva, Mark Anderson, Anders Søgaard (02 Jun 2021)

Implicit Representations of Meaning in Neural Language Models
Belinda Z. Li, Maxwell Nye, Jacob Andreas (01 Jun 2021)
Less is More: Pay Less Attention in Vision Transformers
Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai (29 May 2021)

Diagnosing Transformers in Task-Oriented Semantic Parsing
Shrey Desai, Ahmed Aly (27 May 2021)

Inspecting the concept knowledge graph encoded by modern language models
Carlos Aspillaga, Marcelo Mendoza, Alvaro Soto (27 May 2021)

LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond
Daniel Loureiro, A. Jorge, Jose Camacho-Collados (26 May 2021)

Self-Attention Networks Can Process Bounded Hierarchical Languages
Shunyu Yao, Binghui Peng, Christos H. Papadimitriou, Karthik Narasimhan (24 May 2021)

Unsupervised Speech Recognition
Alexei Baevski, Wei-Ning Hsu, Alexis Conneau, Michael Auli (24 May 2021)

A comparative evaluation and analysis of three generations of Distributional Semantic Models
Alessandro Lenci, Magnus Sahlgren, Patrick Jeuniaux, Amaru Cuba Gyllensten, Martina Miliani (20 May 2021)

Compositional Processing Emerges in Neural Networks Solving Math Problems
Jacob Russin, Roland Fernandez, Hamid Palangi, Eric Rosen, Nebojsa Jojic, P. Smolensky, Jianfeng Gao (19 May 2021)

Fine-grained Interpretation and Causation Analysis in Deep NLP Models
Hassan Sajjad, Narine Kokhlikyan, Fahim Dalvi, Nadir Durrani (17 May 2021)

How is BERT surprised? Layerwise detection of linguistic anomalies
Bai Li, Zining Zhu, Guillaume Thomas, Yang Xu, Frank Rudzicz (16 May 2021)

The Low-Dimensional Linear Geometry of Contextualized Word Representations
Evan Hernandez, Jacob Andreas (15 May 2021)
Go Beyond Plain Fine-tuning: Improving Pretrained Models for Social Commonsense
Ting-Yun Chang, Yang Liu, Karthik Gopalakrishnan, Behnam Hedayatnia, Pei Zhou, Dilek Z. Hakkani-Tür (12 May 2021)

Swarm Differential Privacy for Purpose Driven Data-Information-Knowledge-Wisdom Architecture
Yingbo Li, Yucong Duan, Z. Maamar, Haoyang Che, Anamaria-Beatrice Spulber, Stelios Fuentes (09 May 2021)

FNet: Mixing Tokens with Fourier Transforms
James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon (09 May 2021)

Understanding by Understanding Not: Modeling Negation in Language Models
Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, R. Devon Hjelm, Alessandro Sordoni, Rameswar Panda (07 May 2021)

Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality
Adithya Ganesan, Matthew Matero, Aravind Reddy Ravula, Huy-Hien Vu, H. Andrew Schwartz (07 May 2021)

Bird's Eye: Probing for Linguistic Graph Structures with a Simple Information-Theoretic Approach
Buse Giledereli, Mrinmaya Sachan (06 May 2021)

Let's Play Mono-Poly: BERT Can Reveal Words' Polysemy Level and Partitionability into Senses
Aina Garí Soler, Marianna Apidianaki (29 Apr 2021)

Morph Call: Probing Morphosyntactic Content of Multilingual Transformers
Vladislav Mikhailov, O. Serikov, Ekaterina Artemova (26 Apr 2021)

Attention vs non-attention for a Shapley-based explanation method
T. Kersten, Hugh Mee Wong, Jaap Jumelet, Dieuwke Hupkes (26 Apr 2021)

Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?
William Merrill, Yoav Goldberg, Roy Schwartz, Noah A. Smith (22 Apr 2021)

Knowledge Neurons in Pretrained Transformers
Damai Dai, Li Dong, Y. Hao, Zhifang Sui, Baobao Chang, Furu Wei (18 Apr 2021)
A multilabel approach to morphosyntactic probing
Naomi Tachikawa Shapiro, Amandalynne Paullada, Shane Steinert-Threlkeld (17 Apr 2021)

Moving on from OntoNotes: Coreference Resolution Model Transfer
Patrick Xia, Benjamin Van Durme (17 Apr 2021)

Identifying the Limits of Cross-Domain Knowledge Transfer for Pretrained Models
Zhengxuan Wu, Nelson F. Liu, Christopher Potts (17 Apr 2021)

Memorisation versus Generalisation in Pre-trained Language Models
Michael Tänzer, Sebastian Ruder, Marek Rei (16 Apr 2021)

MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning
Mengzhou Xia, Guoqing Zheng, Subhabrata Mukherjee, Milad Shokouhi, Graham Neubig, Ahmed Hassan Awadallah (16 Apr 2021)

Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models
Matteo Alleman, J. Mamou, Miguel Rio, Hanlin Tang, Yoon Kim, SueYeon Chung (15 Apr 2021)

Effect of Post-processing on Contextualized Word Representations
Hassan Sajjad, Firoj Alam, Fahim Dalvi, Nadir Durrani (15 Apr 2021)

Disentangling Representations of Text by Masking Transformers
Xiongyi Zhang, Jan-Willem van de Meent, Byron C. Wallace (14 Apr 2021)

Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little
Koustuv Sinha, Robin Jia, Dieuwke Hupkes, J. Pineau, Adina Williams, Douwe Kiela (14 Apr 2021)
Mediators in Determining what Processing BERT Performs First
Aviv Slobodkin, Leshem Choshen, Omri Abend (13 Apr 2021)

On the Impact of Knowledge-based Linguistic Annotations in the Quality of Scientific Embeddings
Andrés García-Silva, R. Denaux, José Manuél Gómez-Pérez (13 Apr 2021)

Understanding Transformers for Bot Detection in Twitter
Andrés García-Silva, Cristian Berrío, José Manuél Gómez-Pérez (13 Apr 2021)

What's in your Head? Emergent Behaviour in Multi-Task Transformer Models
Mor Geva, Uri Katz, Aviv Ben-Arie, Jonathan Berant (13 Apr 2021)

DirectProbe: Studying Representations without Classifiers
Yichu Zhou, Vivek Srikumar (13 Apr 2021)

Evaluating Saliency Methods for Neural Language Models
Shuoyang Ding, Philipp Koehn (12 Apr 2021)

Does My Representation Capture X? Probe-Ably
Deborah Ferreira, Julia Rozanova, Mokanarangan Thayaparan, Marco Valentino, André Freitas (12 Apr 2021)

Joint Universal Syntactic and Semantic Parsing
Elias Stengel-Eskin, Kenton W. Murray, Sheng Zhang, Aaron Steven White, Benjamin Van Durme (12 Apr 2021)

Low-Complexity Probing via Finding Subnetworks
Steven Cao, Victor Sanh, Alexander M. Rush (08 Apr 2021)