Fixed-Point RNNs: From Diagonal to Dense in a Few Iterations

13 March 2025

Papers citing "Fixed-Point RNNs: From Diagonal to Dense in a Few Iterations"

20 / 20 papers shown

Title
Towards Scalable and Stable Parallelization of Nonlinear RNNs Xavier Gonzalez Andrew Warrington Jimmy T.H. Smith Scott W. Linderman 182 10 0 17 Jan 2025
VMamba: Visual State Space Model Yue Liu Yunjie Tian Yuzhong Zhao Hongtian Yu Lingxi Xie Yaowei Wang Qixiang Ye Jianbin Jiao Yunfan Liu Mamba 234 675 0 31 Dec 2024
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues Riccardo Grazzi Julien N. Siems Jörg Franke Arber Zela Frank Hutter Massimiliano Pontil 131 22 0 19 Nov 2024
Artificial Kuramoto Oscillatory Neurons Takeru Miyato Sindy Löwe Andreas Geiger Max Welling AI4CE 155 9 0 17 Oct 2024
An Empirical Study of Mamba-based Language Models R. Waleffe Wonmin Byeon Duncan Riach Brandon Norick V. Korthikanti ... Vartika Singh Jared Casper Jan Kautz Mohammad Shoeybi Bryan Catanzaro 98 72 0 12 Jun 2024
Recurrent neural networks: vanishing and exploding gradients are not the end of the story Nicolas Zucchet Antonio Orvieto ODL AAML 65 15 0 31 May 2024
The Illusion of State in State-Space Models William Merrill Jackson Petty Ashish Sabharwal 64 54 0 12 Apr 2024
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence Bo Peng Daniel Goldstein Quentin G. Anthony Alon Albalak Eric Alcaide ... Bingchen Zhao Qihang Zhao Peng Zhou Jian Zhu Ruijie Zhu 65 78 0 08 Apr 2024
Theoretical Foundations of Deep Selective State-Space Models Nicola Muca Cirone Antonio Orvieto Benjamin Walker C. Salvi Terry Lyons Mamba 152 31 0 29 Feb 2024
It's Raw! Audio Generation with State-Space Models Karan Goel Albert Gu Chris Donahue Christopher Ré 51 190 0 20 Feb 2022
Stabilizing Equilibrium Models by Jacobian Regularization Shaojie Bai V. Koltun J. Zico Kolter 57 57 0 28 Jun 2021
Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks Avi Schwarzschild Eitan Borgnia Arjun Gupta Furong Huang U. Vishkin Micah Goldblum Tom Goldstein 56 75 0 08 Jun 2021
Linear Transformers Are Secretly Fast Weight Programmers Imanol Schlag Kazuki Irie Jürgen Schmidhuber 94 245 0 22 Feb 2021
Long Range Arena: A Benchmark for Efficient Transformers Yi Tay Mostafa Dehghani Samira Abnar Songlin Yang Dara Bahri Philip Pham J. Rao Liu Yang Sebastian Ruder Donald Metzler 128 717 0 08 Nov 2020
Rethinking Attention with Performers K. Choromanski Valerii Likhosherstov David Dohan Xingyou Song Andreea Gane ... Afroz Mohiuddin Lukasz Kaiser David Belanger Lucy J. Colwell Adrian Weller 163 1,570 0 30 Sep 2020
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention Angelos Katharopoulos Apoorv Vyas Nikolaos Pappas Franccois Fleuret 166 1,755 0 29 Jun 2020
Linformer: Self-Attention with Linear Complexity Sinong Wang Belinda Z. Li Madian Khabsa Han Fang Hao Ma 183 1,694 0 08 Jun 2020
Implicit Deep Learning L. Ghaoui Fangda Gu Bertrand Travacca Armin Askari Alicia Y. Tsai AI4CE 54 178 0 17 Aug 2019
Attention Is All You Need Ashish Vaswani Noam M. Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan Gomez Lukasz Kaiser Illia Polosukhin 3DV 591 130,942 0 12 Jun 2017
Adaptive Computation Time for Recurrent Neural Networks Alex Graves 78 545 0 29 Mar 2016