Knowledge Neurons in Pretrained Transformers

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

18 April 2021

Damai Dai

Li Dong

Y. Hao

Zhifang Sui

Furu Wei

KELM


Abstract

Large-scale pretrained language models are surprisingly good at recalling factual knowledge presented in the training corpus. In this paper, we explore how implicit knowledge is stored in pretrained Transformers by introducing the concept of knowledge neurons. Given a relational fact, we propose a knowledge attribution method to identify the neurons that express the fact. We show that the activation of such knowledge neurons is highly correlated with the expression of their corresponding facts. In addition, even without fine-tuning, we can leverage knowledge neurons to explicitly edit (for example, update or erase) specific factual knowledge in pretrained Transformers.
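
The abstract does not spell out how attribution is computed, so the following is only a minimal sketch of one gradient-based option: integrated gradients over the hidden activations of a toy feed-forward layer, with a zero baseline. The layer, its weights, and every name here (`answer_prob`, `attribute_neurons`, `suppress`) are hypothetical illustrations and are not taken from the paper or its code release.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def answer_prob(h, W_out, answer):
    """Probability of the answer token given hidden activations h (toy model)."""
    return softmax(W_out @ h)[answer]

def grad_answer_prob(h, W_out, answer):
    """Analytic gradient of answer_prob with respect to h."""
    p = softmax(W_out @ h)
    # d p_a / d z_k = p_a * (1[k == a] - p_k), then chain rule through z = W_out @ h
    dz = -p[answer] * p
    dz[answer] += p[answer]
    return W_out.T @ dz

def attribute_neurons(h, W_out, answer, steps=50):
    """Midpoint Riemann-sum approximation of integrated gradients (zero baseline)."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_answer_prob(a * h, W_out, answer) for a in alphas])
    return h * grads.mean(axis=0)

def suppress(h, neuron_ids):
    """Return a copy of h with the chosen neurons zeroed out."""
    h2 = h.copy()
    h2[neuron_ids] = 0.0
    return h2

# Assumed toy values: 4 hidden neurons, 3 candidate answers.
h = np.array([1.0, 0.5, 0.0, 2.0])
W_out = np.array([[1.0, 0.0, 0.0, 0.5],
                  [0.0, 1.0, 0.0, -0.5],
                  [0.0, 0.0, 1.0, 0.0]])
scores = attribute_neurons(h, W_out, answer=0)
top = int(np.argmax(scores))
print("attribution scores:", scores)
print("prob before:", answer_prob(h, W_out, 0))
print("prob after suppressing top neuron:", answer_prob(suppress(h, [top]), W_out, 0))
```

In this toy setting, zeroing the highest-scoring neuron lowers the probability of the answer, which mirrors the kind of relationship between neuron activation and fact expression that the abstract describes.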
