ResearchTrend.AI
Resurrecting Recurrent Neural Networks for Long Sequences

11 March 2023
Antonio Orvieto
Samuel L. Smith
Albert Gu
Anushan Fernando
Çağlar Gülçehre
Razvan Pascanu
Soham De

Papers citing "Resurrecting Recurrent Neural Networks for Long Sequences"

48 / 48 papers shown
Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook
Muyi Bao
Shuchang Lyu
Zhaoyang Xu
Huiyu Zhou
Jinchang Ren
Shiming Xiang
Xuelong Li
Guangliang Cheng
Mamba
252
0
0
01 May 2025
RWKV-X: A Linear Complexity Hybrid Language Model
Haowen Hou
Zhiyi Huang
Kaifeng Tan
Rongchang Lu
Fei Richard Yu
VLM
136
0
0
30 Apr 2025
Empirical Evaluation of Knowledge Distillation from Transformers to Subquadratic Language Models
Patrick Haller
Jonas Golde
Alan Akbik
94
0
0
19 Apr 2025
Leveraging State Space Models in Long Range Genomics
Matvei Popov
Aymen Kallala
Anirudha Ramesh
Narimane Hennouni
Shivesh Khaitan
Rick Gentry
Alain-Sam Cohen
Mamba
110
0
0
07 Apr 2025
TRA: Better Length Generalisation with Threshold Relative Attention
Mattia Opper
Roland Fernandez
P. Smolensky
Jianfeng Gao
97
0
0
29 Mar 2025
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
M. Beck
Korbinian Poppel
Phillip Lippe
Sepp Hochreiter
125
3
0
18 Mar 2025
MoM: Linear Sequence Modeling with Mixture-of-Memories
Jusen Du
Weigao Sun
Disen Lan
Jiaxi Hu
Yu Cheng
KELM
128
4
0
19 Feb 2025
On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning
Alvaro Arroyo
Alessio Gravina
Benjamin Gutteridge
Federico Barbero
Claudio Gallicchio
Xiaowen Dong
Michael M. Bronstein
P. Vandergheynst
111
10
0
15 Feb 2025
HadamRNN: Binary and Sparse Ternary Orthogonal RNNs
Armand Foucault
Franck Mamalet
François Malgouyres
MQ
229
0
0
28 Jan 2025
Towards Scalable and Stable Parallelization of Nonlinear RNNs
Xavier Gonzalez
Andrew Warrington
Jimmy T.H. Smith
Scott W. Linderman
219
10
0
17 Jan 2025
Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba
Wall Kim
Mamba
96
0
0
10 Jan 2025
OCTAMamba: A State-Space Model Approach for Precision OCTA Vasculature Segmentation
Shun Zou
Zhuo Zhang
Guangwei Gao
Mamba
166
1
0
03 Jan 2025
Expansion Span: Combining Fading Memory and Retrieval in Hybrid State Space Models
Elvis Nunez
Luca Zancato
Benjamin Bowman
Aditya Golatkar
Wei Xia
Stefano Soatto
170
4
0
17 Dec 2024
A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks
Thomas Schmied
Thomas Adler
Vihang Patil
M. Beck
Korbinian Poppel
Johannes Brandstetter
Günter Klambauer
Razvan Pascanu
Sepp Hochreiter
193
6
0
29 Oct 2024
Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
Jerry Huang
Prasanna Parthasarathi
Mehdi Rezagholizadeh
Boxing Chen
Sarath Chandar
125
0
0
22 Oct 2024
State-space models can learn in-context by gradient descent
Neeraj Mohan Sushma
Yudou Tian
Harshvardhan Mestha
Nicolo Colombo
David Kappel
Anand Subramoney
97
3
0
15 Oct 2024
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Federico Arangath Joseph
Jerome Sieber
Melanie Zeilinger
Carmen Amo Alonso
172
0
0
14 Oct 2024
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He
Philip N. Garner
158
0
0
09 Oct 2024
Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning
Aaditya Naik
Jason Liu
Claire Wang
Amish Sethi
Saikat Dutta
Mayur Naik
Eric Wong
87
2
0
04 Oct 2024
Oscillatory State-Space Models
T. Konstantin Rusch
Daniela Rus
AI4TS
390
8
0
04 Oct 2024
How Well Can a Long Sequence Model Model Long Sequences? Comparing Architechtural Inductive Biases on Long-Context Abilities
Jerry Huang
99
7
0
11 Jul 2024
Inferring stochastic low-rank recurrent neural networks from neural data
Matthijs Pals
A Erdem Sağtekin
Felix Pei
Manuel Gloeckler
Jakob H Macke
480
7
0
24 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
138
66
0
11 Jun 2024
Length independent generalization bounds for deep SSM architectures via Rademacher contraction and stability constraints
Dániel Rácz
Mihaly Petreczky
Bálint Daróczy
90
1
0
30 May 2024
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury
Cornelia Caragea
93
1
0
01 Feb 2024
Input Convex Lipschitz RNN: A Fast and Robust Approach for Engineering Tasks
Zihao Wang
Zhe Wu
70
3
0
15 Jan 2024
Real-Time Recurrent Reinforcement Learning
Julian Lemmel
Radu Grosu
71
2
0
08 Nov 2023
Gated Recurrent Neural Networks with Weighted Time-Delay Feedback
N. Benjamin Erichson
Soon Hoe Lim
Michael W. Mahoney
63
6
0
01 Dec 2022
What Makes Convolutional Models Great on Long Sequence Modeling?
Yuhong Li
Tianle Cai
Yi Zhang
Deming Chen
Debadeepta Dey
VLM
67
96
0
17 Oct 2022
Liquid Structural State-Space Models
Ramin Hasani
Mathias Lechner
Tsun-Hsuan Wang
Makram Chahine
Alexander Amini
Daniela Rus
AI4TS
136
101
0
26 Sep 2022
Long Range Language Modeling via Gated State Spaces
Harsh Mehta
Ankit Gupta
Ashok Cutkosky
Behnam Neyshabur
Mamba
89
241
0
27 Jun 2022
How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections
Albert Gu
Isys Johnson
Aman Timalsina
Atri Rudra
Christopher Ré
Mamba
157
97
0
24 Jun 2022
On the Parameterization and Initialization of Diagonal State Space Models
Albert Gu
Ankit Gupta
Karan Goel
Christopher Ré
81
315
0
23 Jun 2022
FiLM: Frequency improved Legendre Memory Model for Long-term Time Series Forecasting
Tian Zhou
Ziqing Ma
Xue Wang
Qingsong Wen
Liang Sun
Tao Yao
Wotao Yin
Rong Jin
AI4TS
153
180
0
18 May 2022
Diagonal State Spaces are as Effective as Structured State Spaces
Ankit Gupta
Albert Gu
Jonathan Berant
114
305
0
27 Mar 2022
It's Raw! Audio Generation with State-Space Models
Karan Goel
Albert Gu
Chris Donahue
Christopher Ré
55
191
0
20 Feb 2022
Long Range Arena: A Benchmark for Efficient Transformers
Yi Tay
Mostafa Dehghani
Samira Abnar
Yikang Shen
Dara Bahri
Philip Pham
J. Rao
Liu Yang
Sebastian Ruder
Donald Metzler
136
718
0
08 Nov 2020
HiPPO: Recurrent Memory with Optimal Polynomial Projections
Albert Gu
Tri Dao
Stefano Ermon
Atri Rudra
Christopher Ré
110
517
0
17 Aug 2020
Unbiased Online Recurrent Optimization
Corentin Tallec
Yann Ollivier
67
98
0
16 Feb 2017
Language Modeling with Gated Convolutional Networks
Yann N. Dauphin
Angela Fan
Michael Auli
David Grangier
237
2,397
0
23 Dec 2016
Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections
Zakaria Mhammedi
Andrew D. Hellicar
Ashfaqur Rahman
James Bailey
82
129
0
01 Dec 2016
Gradient Descent Learns Linear Dynamical Systems
Moritz Hardt
Tengyu Ma
Benjamin Recht
105
240
0
16 Sep 2016
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
404
7,391
0
12 Sep 2016
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
410
10,482
0
21 Jul 2016
Unitary Evolution Recurrent Neural Networks
Martín Arjovsky
Amar Shah
Yoshua Bengio
ODL
75
770
0
20 Nov 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
463
43,289
0
11 Feb 2015
On the Properties of Neural Machine Translation: Encoder-Decoder Approaches
Kyunghyun Cho
B. V. Merrienboer
Dzmitry Bahdanau
Yoshua Bengio
AI4CE
AIMat
244
6,776
0
03 Sep 2014
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
Kyunghyun Cho
B. V. Merrienboer
Çağlar Gülçehre
Dzmitry Bahdanau
Fethi Bougares
Holger Schwenk
Yoshua Bengio
AIMat
1.0K
23,344
0
03 Jun 2014