Sequential Attention

Abstract

In this paper we propose a neural network model with a novel Sequential Attention layer that extends soft attention: the weight assigned to each word in an input sequence takes into account not just how well that word matches a query, but also how well the surrounding words match. We evaluate this approach on the task of reading comprehension (the Who did What and CNN datasets) and show that it dramatically improves on a strong baseline, the Stanford Reader. The resulting model is competitive with the state of the art.
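To make the contrast with plain soft attention concrete, the following is a minimal illustrative sketch and not the Sequential Attention layer itself, whose details the abstract does not give. It assumes dot-product scoring and stands in a simple moving average over neighbouring scores for whatever mechanism the model actually uses to share information between adjacent words. The function names, the `window` parameter, and the toy vectors are all hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def soft_attention(tokens, query):
    """Plain soft attention: each token is scored against the query on its own."""
    scores = tokens @ query
    return softmax(scores)


def context_aware_attention(tokens, query, window=1):
    """Hypothetical stand-in: average each token's score with its neighbours' scores.

    Positions past either end of the sequence are treated as scoring zero,
    so tokens at the edges are pulled slightly toward lower weights.
    """
    scores = tokens @ query
    padded = np.pad(scores, window, mode="constant")
    kernel = np.ones(2 * window + 1) / (2 * window + 1)
    smoothed = np.convolve(padded, kernel, mode="valid")
    return softmax(smoothed)


# Toy sequence: token 0 matches the query in isolation,
# while tokens 3, 4 and 5 form a run of consecutive matches.
tokens = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
    [1.0, 0.0],
    [1.0, 0.0],
    [1.0, 0.0],
])
query = np.array([2.0, 0.0])

plain = soft_attention(tokens, query)
contextual = context_aware_attention(tokens, query)
```

In this toy case `plain` gives the isolated match at position 0 the same weight as each match in the run, whereas `contextual` favours position 4, whose neighbours also match the query. A learned layer could weigh neighbouring evidence in far richer ways than a fixed average.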
