ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

BERT Rediscovers the Classical NLP Pipeline
arXiv:1905.05950, 15 May 2019
Ian Tenney, Dipanjan Das, Ellie Pavlick

Papers citing "BERT Rediscovers the Classical NLP Pipeline"

50 of 821 citing papers shown (title, authors, date):
It's All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning
Alexey Tikhonov, Max Ryabinin (22 Jun 2021)

Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
Colin Wei, Sang Michael Xie, Tengyu Ma (17 Jun 2021)

Pre-Trained Models: Past, Present and Future
Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, ..., Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu (14 Jun 2021)

Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models
Matthew Finlayson, Aaron Mueller, Sebastian Gehrmann, Stuart M. Shieber, Tal Linzen, Yonatan Belinkov (10 Jun 2021)

Causal Abstractions of Neural Networks
Atticus Geiger, Hanson Lu, Thomas Icard, Christopher Potts (06 Jun 2021)

BERTnesia: Investigating the capture and forgetting of knowledge in BERT
Jonas Wallat, Jaspreet Singh, Avishek Anand (05 Jun 2021)

CLIP: A Dataset for Extracting Action Items for Physicians from Hospital Discharge Notes
J. Mullenbach, Yada Pruksachatkun, Sean Adler, Jennifer Seale, Jordan Swartz, T. McKelvey, Hui Dai, Yi Yang, David Sontag (04 Jun 2021)

Enriching Transformers with Structured Tensor-Product Representations for Abstractive Summarization
Yichen Jiang, Asli Celikyilmaz, P. Smolensky, Paul Soulos, Sudha Rao, Hamid Palangi, Roland Fernandez, Caitlin Smith, Joey Tianyi Zhou, Jianfeng Gao (02 Jun 2021)

John praised Mary because he? Implicit Causality Bias and Its Interaction with Explicit Cues in LMs
Yova Kementchedjhieva, Mark Anderson, Anders Søgaard (02 Jun 2021)

Implicit Representations of Meaning in Neural Language Models
Belinda Z. Li, Maxwell Nye, Jacob Andreas (01 Jun 2021)
Less is More: Pay Less Attention in Vision Transformers
Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai (29 May 2021)

Diagnosing Transformers in Task-Oriented Semantic Parsing
Shrey Desai, Ahmed Aly (27 May 2021)

Inspecting the concept knowledge graph encoded by modern language models
Carlos Aspillaga, Marcelo Mendoza, Alvaro Soto (27 May 2021)

LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond
Daniel Loureiro, A. Jorge, Jose Camacho-Collados (26 May 2021)

Self-Attention Networks Can Process Bounded Hierarchical Languages
Shunyu Yao, Binghui Peng, Christos H. Papadimitriou, Karthik Narasimhan (24 May 2021)

Unsupervised Speech Recognition
Alexei Baevski, Wei-Ning Hsu, Alexis Conneau, Michael Auli (24 May 2021)

A comparative evaluation and analysis of three generations of Distributional Semantic Models
Alessandro Lenci, Magnus Sahlgren, Patrick Jeuniaux, Amaru Cuba Gyllensten, Martina Miliani (20 May 2021)

Compositional Processing Emerges in Neural Networks Solving Math Problems
Jacob Russin, Roland Fernandez, Hamid Palangi, Eric Rosen, Nebojsa Jojic, P. Smolensky, Jianfeng Gao (19 May 2021)

Fine-grained Interpretation and Causation Analysis in Deep NLP Models
Hassan Sajjad, Narine Kokhlikyan, Fahim Dalvi, Nadir Durrani (17 May 2021)

How is BERT surprised? Layerwise detection of linguistic anomalies
Bai Li, Zining Zhu, Guillaume Thomas, Yang Xu, Frank Rudzicz (16 May 2021)

The Low-Dimensional Linear Geometry of Contextualized Word Representations
Evan Hernandez, Jacob Andreas (15 May 2021)
Go Beyond Plain Fine-tuning: Improving Pretrained Models for Social Commonsense
Ting-Yun Chang, Yang Liu, Karthik Gopalakrishnan, Behnam Hedayatnia, Pei Zhou, Dilek Z. Hakkani-Tür (12 May 2021)

Swarm Differential Privacy for Purpose Driven Data-Information-Knowledge-Wisdom Architecture
Yingbo Li, Yucong Duan, Z. Maamar, Haoyang Che, Anamaria-Beatrice Spulber, Stelios Fuentes (09 May 2021)

FNet: Mixing Tokens with Fourier Transforms
James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon (09 May 2021)

Understanding by Understanding Not: Modeling Negation in Language Models
Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, R. Devon Hjelm, Alessandro Sordoni, Rameswar Panda (07 May 2021)

Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality
Adithya Ganesan, Matthew Matero, Aravind Reddy Ravula, Huy-Hien Vu, H. Andrew Schwartz (07 May 2021)

Bird's Eye: Probing for Linguistic Graph Structures with a Simple Information-Theoretic Approach
Buse Giledereli, Mrinmaya Sachan (06 May 2021)

Let's Play Mono-Poly: BERT Can Reveal Words' Polysemy Level and Partitionability into Senses
Aina Garí Soler, Marianna Apidianaki (29 Apr 2021)

Morph Call: Probing Morphosyntactic Content of Multilingual Transformers
Vladislav Mikhailov, O. Serikov, Ekaterina Artemova (26 Apr 2021)

Attention vs non-attention for a Shapley-based explanation method
T. Kersten, Hugh Mee Wong, Jaap Jumelet, Dieuwke Hupkes (26 Apr 2021)

Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?
William Merrill, Yoav Goldberg, Roy Schwartz, Noah A. Smith (22 Apr 2021)

Knowledge Neurons in Pretrained Transformers
Damai Dai, Li Dong, Y. Hao, Zhifang Sui, Baobao Chang, Furu Wei (18 Apr 2021)
A multilabel approach to morphosyntactic probing
Naomi Tachikawa Shapiro, Amandalynne Paullada, Shane Steinert-Threlkeld (17 Apr 2021)

Moving on from OntoNotes: Coreference Resolution Model Transfer
Patrick Xia, Benjamin Van Durme (17 Apr 2021)

Identifying the Limits of Cross-Domain Knowledge Transfer for Pretrained Models
Zhengxuan Wu, Nelson F. Liu, Christopher Potts (17 Apr 2021)

Memorisation versus Generalisation in Pre-trained Language Models
Michael Tänzer, Sebastian Ruder, Marek Rei (16 Apr 2021)

MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning
Mengzhou Xia, Guoqing Zheng, Subhabrata Mukherjee, Milad Shokouhi, Graham Neubig, Ahmed Hassan Awadallah (16 Apr 2021)

Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models
Matteo Alleman, J. Mamou, Miguel Rio, Hanlin Tang, Yoon Kim, SueYeon Chung (15 Apr 2021)

Effect of Post-processing on Contextualized Word Representations
Hassan Sajjad, Firoj Alam, Fahim Dalvi, Nadir Durrani (15 Apr 2021)

Disentangling Representations of Text by Masking Transformers
Xiongyi Zhang, Jan-Willem van de Meent, Byron C. Wallace (14 Apr 2021)

Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little
Koustuv Sinha, Robin Jia, Dieuwke Hupkes, J. Pineau, Adina Williams, Douwe Kiela (14 Apr 2021)
Mediators in Determining what Processing BERT Performs First
Aviv Slobodkin, Leshem Choshen, Omri Abend (13 Apr 2021)

On the Impact of Knowledge-based Linguistic Annotations in the Quality of Scientific Embeddings
Andrés García-Silva, R. Denaux, José Manuél Gómez-Pérez (13 Apr 2021)

Understanding Transformers for Bot Detection in Twitter
Andrés García-Silva, Cristian Berrío, José Manuél Gómez-Pérez (13 Apr 2021)

What's in your Head? Emergent Behaviour in Multi-Task Transformer Models
Mor Geva, Uri Katz, Aviv Ben-Arie, Jonathan Berant (13 Apr 2021)

DirectProbe: Studying Representations without Classifiers
Yichu Zhou, Vivek Srikumar (13 Apr 2021)

Evaluating Saliency Methods for Neural Language Models
Shuoyang Ding, Philipp Koehn (12 Apr 2021)

Does My Representation Capture X? Probe-Ably
Deborah Ferreira, Julia Rozanova, Mokanarangan Thayaparan, Marco Valentino, André Freitas (12 Apr 2021)

Joint Universal Syntactic and Semantic Parsing
Elias Stengel-Eskin, Kenton W. Murray, Sheng Zhang, Aaron Steven White, Benjamin Van Durme (12 Apr 2021)

Low-Complexity Probing via Finding Subnetworks
Steven Cao, Victor Sanh, Alexander M. Rush (08 Apr 2021)