2405.04515
Cited By
A Transformer with Stack Attention
7 May 2024
Jiaoda Li
Jennifer C. White
Mrinmaya Sachan
Ryan Cotterell
Papers citing
"A Transformer with Stack Attention"
13 / 13 papers shown
Title
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury
Cornelia Caragea
60
1
0
01 Feb 2024
Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective
Guhao Feng
Bohang Zhang
Yuntian Gu
Haotian Ye
Di He
Liwei Wang
LRM
64
235
0
24 May 2023
Formal Language Recognition by Hard Attention Transformers: Perspectives from Circuit Complexity
Sophie Hao
Dana Angluin
Robert Frank
28
75
0
13 Apr 2022
Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale
Laurent Sartran
Samuel Barrett
A. Kuncoro
Miloš Stanojević
Phil Blunsom
Chris Dyer
59
50
0
01 Mar 2022
Datasets: A Community Library for Natural Language Processing
Quentin Lhoest
Albert Villanova del Moral
Yacine Jernite
A. Thakur
Patrick von Platen
...
Thibault Goehringer
Victor Mustar
François Lagunas
Alexander M. Rush
Thomas Wolf
110
596
0
07 Sep 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
126
2,307
0
20 Apr 2021
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
377
24,160
0
26 Jul 2019
Theoretical Limitations of Self-Attention in Neural Sequence Models
Michael Hahn
41
266
0
16 Jun 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
123
3,707
0
09 Jan 2019
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
224
10,412
0
21 Jul 2016
A Decomposable Attention Model for Natural Language Inference
Ankur P. Parikh
Oscar Täckström
Dipanjan Das
Jakob Uszkoreit
304
1,369
0
06 Jun 2016
Recurrent Neural Network Grammars
Chris Dyer
A. Kuncoro
Miguel Ballesteros
Noah A. Smith
GNN
56
524
0
25 Feb 2016
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
AIMat
364
27,205
0
01 Sep 2014