State Spaces Aren't Enough: Machine Translation Needs Attention

25 April 2023

Papers citing "State Spaces Aren't Enough: Machine Translation Needs Attention"

8 / 8 papers shown

Title
Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination Jerry Huang Prasanna Parthasarathi Mehdi Rezagholizadeh Boxing Chen Sarath Chandar 53 0 0 22 Oct 2024
How Well Can a Long Sequence Model Model Long Sequences? Comparing Architechtural Inductive Biases on Long-Context Abilities Jerry Huang 57 7 0 11 Jul 2024
How Effective are State Space Models for Machine Translation? Hugo Pitorro Pavlo Vasylenko Marcos Vinícius Treviso André F. T. Martins Mamba 45 2 0 07 Jul 2024
Linear Transformers with Learnable Kernel Functions are Better In-Context Models Yaroslav Aksenov Nikita Balagansky Sofia Maria Lo Cicero Vaina Boris Shaposhnikov Alexey Gorbatovski Daniil Gavrilov KELM 33 5 0 16 Feb 2024
Learning Long Sequences in Spiking Neural Networks Matei Ioan Stan Oliver Rhodes 37 11 0 14 Dec 2023
Sparse Modular Activation for Efficient Sequence Modeling Liliang Ren Yang Liu Shuohang Wang Yichong Xu Chenguang Zhu Chengxiang Zhai 43 13 0 19 Jun 2023
Focus Your Attention (with Adaptive IIR Filters) Shahar Lutati Itamar Zimerman Lior Wolf 32 9 0 24 May 2023
Knowledge Relation Rank Enhanced Heterogeneous Learning Interaction Modeling for Neural Graph Forgetting Knowledge Tracing Linqing Li Zhifeng Wang 24 11 0 08 Apr 2023