Linear Transformers Are Secretly Fast Weight Programmers
Imanol Schlag, Kazuki Irie, Jürgen Schmidhuber · 22 February 2021 · arXiv:2102.11174
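
The paper's central observation, named in its title, is that linear (kernelized) attention is algebraically equivalent to a "fast weight programmer": a network that rewrites an auxiliary weight matrix with additive outer-product updates at every step. A minimal NumPy sketch of that equivalence follows; the function and variable names are illustrative, not code from the paper.

    import numpy as np

    def linear_attention_as_fwp(queries, keys, values):
        """Unnormalised causal linear attention, computed as a fast weight programmer.

        At step t we write the outer product v_t k_t^T into the fast weight
        matrix W and read out y_t = W_t q_t, which equals
        y_t = sum_{i <= t} v_i (k_i^T q_t), i.e. linear attention without
        softmax. Shapes: queries/keys (T, d_k), values (T, d_v).
        """
        d_k, d_v = keys.shape[1], values.shape[1]
        W = np.zeros((d_v, d_k))      # fast weights, reprogrammed every step
        outputs = []
        for q, k, v in zip(queries, keys, values):
            W = W + np.outer(v, k)    # write: additive outer-product update
            outputs.append(W @ q)     # read: query the current fast weights
        return np.stack(outputs)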

Papers citing "Linear Transformers Are Secretly Fast Weight Programmers"

50 / 166 papers shown

Contrastive Training of Complex-Valued Autoencoders for Object Discovery
Aleksandar Stanić, Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber · 24 May 2023 · OCL

Brain-inspired learning in artificial neural networks: a review
Samuel Schmidgall, Jascha Achterberg, Thomas Miconi, Louis Kirsch, Rojin Ziaei, S. P. Hajiseyedrazi, Jason Eshraghian · 18 May 2023

MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers
L. Yu, Daniel Simig, Colin Flaherty, Armen Aghajanyan, Luke Zettlemoyer, M. Lewis · 12 May 2023

ChatGPT-Like Large-Scale Foundation Models for Prognostics and Health Management: A Survey and Roadmaps
Yanfang Li, Huan Wang, Muxia Sun · 10 May 2023 · LM&MA · AI4TS · AI4CE

Accelerating Neural Self-Improvement via Bootstrapping
Kazuki Irie, Jürgen Schmidhuber · 02 May 2023

Meta-Learned Models of Cognition
Marcel Binz, Ishita Dasgupta, Akshay K. Jagadish, M. Botvinick, Jane X. Wang, Eric Schulz · 12 Apr 2023

POPGym: Benchmarking Partially Observable Reinforcement Learning
Steven D. Morad, Ryan Kortvelesy, Matteo Bettini, Stephan Liwicki, Amanda Prorok · 03 Mar 2023 · OffRL

Permutation-Invariant Set Autoencoders with Fixed-Size Embeddings for Multi-Agent Learning
Ryan Kortvelesy, Steven D. Morad, Amanda Prorok · 24 Feb 2023 · AI4CE

Hyena Hierarchy: Towards Larger Convolutional Language Models
Michael Poli, Stefano Massaroli, Eric Q. Nguyen, Daniel Y. Fu, Tri Dao, S. Baccus, Yoshua Bengio, Stefano Ermon, Christopher Ré · 21 Feb 2023 · VLM

Theory of coupled neuronal-synaptic dynamics
David G. Clark, L. F. Abbott · 17 Feb 2023

Self-Organising Neural Discrete Representation Learning à la Kohonen
Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber · 15 Feb 2023 · SSL

Efficient Attention via Control Variates
Lin Zheng, Jianbo Yuan, Chong-Jun Wang, Lingpeng Kong · 09 Feb 2023

Hebbian and Gradient-based Plasticity Enables Robust Memory and Rapid Learning in RNNs
Y. Duan, Zhongfan Jia, Qian Li, Yi Zhong, Kaisheng Ma · 07 Feb 2023 · AAML

Mnemosyne: Learning to Train Transformers with Transformers
Deepali Jain, K. Choromanski, Kumar Avinava Dubey, Sumeet Singh, Vikas Sindhwani, Tingnan Zhang, Jie Tan · 02 Feb 2023 · OffRL

Simplex Random Features
Isaac Reid, K. Choromanski, Valerii Likhosherstov, Adrian Weller · 31 Jan 2023

Learning One Abstract Bit at a Time Through Self-Invented Experiments Encoded as Neural Networks
Vincent Herrmann, Louis Kirsch, Jürgen Schmidhuber · 29 Dec 2022 · AI4CE

On Transforming Reinforcement Learning by Transformer: The Development Trajectory
Shengchao Hu, Li Shen, Ya Zhang, Yixin Chen, Dacheng Tao · 29 Dec 2022 · OffRL

Annotated History of Modern AI and Deep Learning
Jürgen Schmidhuber · 21 Dec 2022 · MLAU · AI4TS · AI4CE

Transformers learn in-context by gradient descent
J. Oswald, Eyvind Niklasson, E. Randazzo, João Sacramento, A. Mordvintsev, A. Zhmoginov, Max Vladymyrov · 15 Dec 2022 · MLT

Meta-Learning Fast Weight Language Models
Kevin Clark, Kelvin Guu, Ming-Wei Chang, Panupong Pasupat, Geoffrey E. Hinton, Mohammad Norouzi · 05 Dec 2022 · KELM

What learning algorithm is in-context learning? Investigations with linear models
Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, Denny Zhou · 28 Nov 2022

Learning to Control Rapidly Changing Synaptic Connections: An Alternative Type of Memory in Sequence Processing Artificial Neural Networks
Kazuki Irie, Jürgen Schmidhuber · 17 Nov 2022 · KELM

Characterizing Verbatim Short-Term Memory in Neural Language Models
K. Armeni, C. Honey, Tal Linzen · 24 Oct 2022 · KELM · RALM

Modeling Context With Linear Attention for Scalable Document-Level Translation
Zhaofeng Wu, Hao Peng, Nikolaos Pappas, Noah A. Smith · 16 Oct 2022

CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong · 14 Oct 2022 · 3DV

Designing Robust Transformers using Robust Kernel Density Estimation
Xing Han, Tongzheng Ren, T. Nguyen, Khai Nguyen, Joydeep Ghosh, Nhat Ho · 11 Oct 2022

LARF: Two-level Attention-based Random Forests with a Mixture of Contamination Models
A. Konstantinov, Lev V. Utkin · 11 Oct 2022

Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
H. H. Mao · 09 Oct 2022

Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules
Kazuki Irie, Jürgen Schmidhuber · 07 Oct 2022

Deep is a Luxury We Don't Have
Ahmed Taha, Yen Nhi Truong Vu, Brent Mombourquette, Thomas P. Matthews, Jason Su, Sadanand Singh · 11 Aug 2022 · ViT · MedIm

Learning to Generalize with Object-centric Agents in the Open World Survival Game Crafter
Aleksandar Stanić, Yujin Tang, David R Ha, Jürgen Schmidhuber · 05 Aug 2022 · ELM

AGBoost: Attention-based Modification of Gradient Boosting Machine
A. Konstantinov, Lev V. Utkin, Stanislav R. Kirpichenko · 12 Jul 2022 · ODL

Attention and Self-Attention in Random Forests
Lev V. Utkin, A. Konstantinov · 09 Jul 2022

Goal-Conditioned Generators of Deep Policies
Francesco Faccio, Vincent Herrmann, Aditya A. Ramesh, Louis Kirsch, Jürgen Schmidhuber · 04 Jul 2022 · OffRL

Rethinking Query-Key Pairwise Interactions in Vision Transformers
Cheng-rong Li, Yangxin Liu · 01 Jul 2022

Short-Term Plasticity Neurons Learning to Learn and Forget
Hector Garcia Rodriguez, Qinghai Guo, Timoleon Moraitis · 28 Jun 2022

Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules
Kazuki Irie, Francesco Faccio, Jürgen Schmidhuber · 03 Jun 2022 · AI4TS

Transformer with Fourier Integral Attentions
T. Nguyen, Minh Pham, Tam Nguyen, Khai Nguyen, Stanley J. Osher, Nhat Ho · 01 Jun 2022

BayesPCN: A Continually Learnable Predictive Coding Associative Memory
Jason Yoo, F. Wood · 20 May 2022 · KELM

Minimal Neural Network Models for Permutation Invariant Agents
J. Pedersen, S. Risi · 12 May 2022

A Call for Clarity in Beam Search: How It Works and When It Stops
Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir R. Radev, Yejin Choi, Noah A. Smith · 11 Apr 2022

Linear Complexity Randomized Self-attention Mechanism
Lin Zheng, Chong-Jun Wang, Lingpeng Kong · 10 Apr 2022

On the link between conscious function and general intelligence in humans and machines
Arthur Juliani, Kai Arulkumaran, Shuntaro Sasai, Ryota Kanai · 24 Mar 2022

Linearizing Transformer with Key-Value Memory
Yizhe Zhang, Deng Cai · 23 Mar 2022

FAR: Fourier Aerial Video Recognition
D. Kothandaraman, Tianrui Guan, Xijun Wang, Sean Hu, Ming-Shun Lin, Tianyi Zhou · 21 Mar 2022

Block-Recurrent Transformers
DeLesley S. Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur · 11 Mar 2022

The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention
Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber · 11 Feb 2022

A Modern Self-Referential Weight Matrix That Learns to Modify Itself
Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber · 11 Feb 2022

Latency Adjustable Transformer Encoder for Language Understanding
Sajjad Kachuee, M. Sharifkhani · 10 Jan 2022

Attention-based Random Forest and Contamination Model
Lev V. Utkin, A. Konstantinov · 08 Jan 2022