A Transformer with Stack Attention
Jiaoda Li, Jennifer C. White, Mrinmaya Sachan, Ryan Cotterell
arXiv:2405.04515 · 7 May 2024

Papers citing "A Transformer with Stack Attention"

13 / 13 papers shown
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury, Cornelia Caragea
01 Feb 2024

Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective
Guhao Feng, Bohang Zhang, Yuntian Gu, Haotian Ye, Di He, Liwei Wang
24 May 2023

Formal Language Recognition by Hard Attention Transformers: Perspectives from Circuit Complexity
Sophie Hao, Dana Angluin, Robert Frank
13 Apr 2022

Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale
Laurent Sartran, Samuel Barrett, A. Kuncoro, Miloš Stanojević, Phil Blunsom, Chris Dyer
01 Mar 2022

Datasets: A Community Library for Natural Language Processing
Quentin Lhoest, Albert Villanova del Moral, Yacine Jernite, A. Thakur, Patrick von Platen, ..., Thibault Goehringer, Victor Mustar, François Lagunas, Alexander M. Rush, Thomas Wolf
07 Sep 2021

RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, Yunfeng Liu
20 Apr 2021

RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov
26 Jul 2019

Theoretical Limitations of Self-Attention in Neural Sequence Models
Michael Hahn
16 Jun 2019

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai, Zhilin Yang, Yiming Yang, J. Carbonell, Quoc V. Le, Ruslan Salakhutdinov
09 Jan 2019

Layer Normalization
Jimmy Lei Ba, J. Kiros, Geoffrey E. Hinton
21 Jul 2016

A Decomposable Attention Model for Natural Language Inference
Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit
06 Jun 2016

Recurrent Neural Network Grammars
Chris Dyer, A. Kuncoro, Miguel Ballesteros, Noah A. Smith
25 Feb 2016

Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
01 Sep 2014