Turning transformer attention weights into zero-shot sequence labelers

Workshop on Representation Learning for NLP (RepL4NLP), 2021

26 March 2021

Main: 5 pages; 6 figures; Bibliography: 2 pages; 10 tables; Appendix: 4 pages

Abstract

We demonstrate how transformer-based models can be redesigned to capture inductive biases across tasks at different granularities and to perform inference in a zero-shot manner. Specifically, we show how sentence-level transformers can be modified into effective token-level sequence labelers without any direct token-level supervision. We compare against a diverse range of previously proposed methods for generating token-level labels, and present a simple yet effective modified attention layer that significantly advances the current state of the art.
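
The abstract does not spell out the modified attention layer, so the following is only a minimal sketch of the general idea of reading token labels off a sentence-level attention layer. It assumes a sigmoid-based soft attention over per-token hidden vectors; the function names, the scoring vector `w`, the bias `b`, and the 0.5 threshold are all hypothetical choices made for illustration, not details taken from the paper.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def token_scores(hidden: Sequence[Sequence[float]],
                 w: Sequence[float],
                 b: float = 0.0) -> List[float]:
    """Unnormalised per-token scores in (0, 1): sigmoid(w . h_i + b).

    Assumption: one scalar score per token, computed from that token's
    hidden vector by a single (hypothetical) scoring vector `w`.
    """
    return [sigmoid(sum(wj * hj for wj, hj in zip(w, h)) + b) for h in hidden]


def attention_weights(scores: Sequence[float]) -> List[float]:
    """Normalise the token scores so that they sum to 1 over the sentence."""
    total = sum(scores)
    return [s / total for s in scores]


def sentence_vector(hidden: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Attention-weighted sum of token vectors, fed to a sentence classifier."""
    dim = len(hidden[0])
    return [sum(a * h[d] for a, h in zip(weights, hidden)) for d in range(dim)]


def zero_shot_labels(scores: Sequence[float],
                     threshold: float = 0.5) -> List[int]:
    """Token labels obtained by thresholding the unnormalised scores.

    No token-level labels are used anywhere: the scores would be shaped
    only by training the sentence-level objective.
    """
    return [1 if s > threshold else 0 for s in scores]


# Toy example: three tokens with 2-dimensional hidden vectors.
hidden = [[0.0, 0.0], [2.0, 1.0], [-1.0, -1.0]]
w = [1.0, 1.0]

scores = token_scores(hidden, w)
weights = attention_weights(scores)
print(zero_shot_labels(scores))
print(sentence_vector(hidden, weights))
```

In this sketch only the sentence vector would receive a training signal; the per-token scores are a by-product that can be thresholded at inference time, which is the sense in which the labeling is zero-shot.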
