Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets

Abstract

While machine learning is currently very successful in several application domains, we are still very far from achieving a real artificial intelligence. In this paper, we study basic sequence prediction problems that are beyond the scope of what is learnable with popular methods such as recurrent networks. We show that simple algorithms can be learned from sequential data with a recurrent network associated with trainable stacks. We focus our study on algorithmically generated sequences such as $a^n b^n$, which can only be learned by models that have the capacity to count. Once trained, we show that our method is able to generalize to sequences up to an arbitrary size. We discuss the limitations of standard machine learning approaches in learning algorithmic regularities of this type. We propose directions to overcome these shortcomings, such as using search-based optimization.
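To make the counting requirement concrete, the following minimal sketch generates strings of the form $a^n b^n$ and recognizes them with a single explicit counter. It is an illustration of the task only, not the stack-augmented network studied in the paper; the function names `make_sequence` and `count_based_recognizer` are hypothetical and chosen for this example.

```python
def make_sequence(n):
    """Return the string a^n b^n, e.g. n=3 gives 'aaabbb'."""
    return "a" * n + "b" * n


def count_based_recognizer(seq):
    """Accept seq iff it has the form a^n b^n with n >= 1.

    Hand-written illustration (not a learned model): one integer
    counter plays the role of the memory needed to match the
    number of a's against the number of b's.
    """
    count = 0        # number of a's not yet matched by a b
    seen_b = False   # becomes True once the b-block has started
    for symbol in seq:
        if symbol == "a":
            if seen_b:
                return False  # an 'a' after a 'b' breaks the pattern
            count += 1
        elif symbol == "b":
            seen_b = True
            count -= 1
            if count < 0:
                return False  # more b's than a's
        else:
            return False      # symbol outside the alphabet {a, b}
    return seen_b and count == 0


if __name__ == "__main__":
    for n in (1, 5, 50):
        print(n, count_based_recognizer(make_sequence(n)))
```

The counter is unbounded, which is why the same procedure works for any n. A model with only a fixed-size state and no such memory has no obvious way to do this for arbitrarily long inputs, which is the difficulty the abstract points to.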
