A Transformer with Stack Attention

7 May 2024

Jennifer C. White

Mrinmaya Sachan

Papers cited by "A Transformer with Stack Attention"

Title | Authors | Tag | Counts | Date
Investigating Recurrent Transformers with Dynamic Halt | Jishnu Ray Chowdhury, Cornelia Caragea | - | 60 1 0 | 01 Feb 2024
Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective | Guhao Feng, Bohang Zhang, Yuntian Gu, Haotian Ye, Di He, Liwei Wang | LRM | 64 235 0 | 24 May 2023
Formal Language Recognition by Hard Attention Transformers: Perspectives from Circuit Complexity | Sophie Hao, Dana Angluin, Robert Frank | - | 28 75 0 | 13 Apr 2022
Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale | Laurent Sartran, Samuel Barrett, A. Kuncoro, Miloš Stanojević, Phil Blunsom, Chris Dyer | - | 59 50 0 | 01 Mar 2022
Datasets: A Community Library for Natural Language Processing | Quentin Lhoest, Albert Villanova del Moral, Yacine Jernite, A. Thakur, Patrick von Platen, ..., Thibault Goehringer, Victor Mustar, François Lagunas, Alexander M. Rush, Thomas Wolf | - | 110 596 0 | 07 Sep 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding | Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, Yunfeng Liu | - | 126 2,307 0 | 20 Apr 2021
RoBERTa: A Robustly Optimized BERT Pretraining Approach | Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov | AIMat | 377 24,160 0 | 26 Jul 2019
Theoretical Limitations of Self-Attention in Neural Sequence Models | Michael Hahn | - | 41 266 0 | 16 Jun 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | Zihang Dai, Zhilin Yang, Yiming Yang, J. Carbonell, Quoc V. Le, Ruslan Salakhutdinov | VLM | 123 3,707 0 | 09 Jan 2019
Layer Normalization | Jimmy Lei Ba, J. Kiros, Geoffrey E. Hinton | - | 224 10,412 0 | 21 Jul 2016
A Decomposable Attention Model for Natural Language Inference | Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit | - | 304 1,369 0 | 06 Jun 2016
Recurrent Neural Network Grammars | Chris Dyer, A. Kuncoro, Miguel Ballesteros, Noah A. Smith | GNN | 56 524 0 | 25 Feb 2016
Neural Machine Translation by Jointly Learning to Align and Translate | Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio | AIMat | 364 27,205 0 | 01 Sep 2014