In this work, we cast abstractive text summarization as a sequence-to-sequence problem and apply the framework of attentional encoder-decoder recurrent neural networks, outperforming the state-of-the-art model of Rush et al. (2015) on two different corpora. We also move beyond the basic architecture and propose several novel models that address important problems in summarization, including modeling keywords, capturing the hierarchical sentence-to-word structure, and emitting words that are key to a document but rare elsewhere. Our work shows that many of these proposed solutions contribute to further improvements in performance. In addition, we introduce a new dataset consisting of multi-sentence summaries and establish performance benchmarks for further research.
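To make the basic architecture concrete, the following is a minimal sketch of an attentional encoder-decoder for sequence-to-sequence summarization, written in PyTorch. It is illustrative only, not the authors' exact model: the class name, the choice of GRU units, the single-layer additive-style attention, and all dimensions are assumptions for exposition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnSeq2Seq(nn.Module):
    """Sketch of an attentional encoder-decoder (hyperparameters illustrative)."""

    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional GRU encoder reads the source document.
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        # Decoder cell is conditioned on the previous word and the attention context.
        self.decoder_cell = nn.GRUCell(emb_dim + 2 * hid_dim, hid_dim)
        # Scores each encoder state against the current decoder state.
        self.attn = nn.Linear(hid_dim + 2 * hid_dim, 1)
        self.proj = nn.Linear(hid_dim + 2 * hid_dim, vocab_size)

    def forward(self, src, tgt):
        # src: (B, S) source token ids; tgt: (B, T) summary ids (teacher forcing).
        enc_out, _ = self.encoder(self.embed(src))  # (B, S, 2H)
        B, S, _ = enc_out.shape
        h = torch.zeros(B, self.decoder_cell.hidden_size, device=src.device)
        logits = []
        for t in range(tgt.size(1)):
            # Attention: weight source positions by relevance to the decoder state.
            scores = self.attn(
                torch.cat([h.unsqueeze(1).expand(B, S, -1), enc_out], dim=-1)
            ).squeeze(-1)                                          # (B, S)
            alpha = F.softmax(scores, dim=-1)
            context = torch.bmm(alpha.unsqueeze(1), enc_out).squeeze(1)  # (B, 2H)
            h = self.decoder_cell(
                torch.cat([self.embed(tgt[:, t]), context], dim=-1), h
            )
            logits.append(self.proj(torch.cat([h, context], dim=-1)))
        return torch.stack(logits, dim=1)  # (B, T, vocab_size)

# Usage with dummy data (batch of 4, source length 30, summary length 10):
model = AttnSeq2Seq(vocab_size=50000)
out = model(torch.randint(0, 50000, (4, 30)), torch.randint(0, 50000, (4, 10)))
```

The paper's extensions (keyword features, hierarchical sentence-to-word attention, and a pointer mechanism for rare words) would layer on top of a base model of this shape.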