ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2006.11527
  4. Cited By
Memory Transformer

Memory Transformer

20 June 2020
Andrey Kravchenko
Yuri Kuratov
Anton Peganov
Grigory V. Sapunov
    RALM
ArXivPDFHTML

Papers citing "Memory Transformer"

20 / 20 papers shown
Title
Compact Recurrent Transformer with Persistent Memory
Compact Recurrent Transformer with Persistent Memory
Edison Mucllari
Z. Daniels
David C. Zhang
Qiang Ye
CLL
VLM
54
0
0
02 May 2025
A generative approach to LLM harmfulness detection with special red flag tokens
A generative approach to LLM harmfulness detection with special red flag tokens
Sophie Xhonneux
David Dobre
Mehrnaz Mohfakhami
Leo Schwinn
Gauthier Gidel
55
1
0
22 Feb 2025
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models
Michael Toker
Ido Galil
Hadas Orgad
Rinon Gal
Yoad Tewel
Gal Chechik
Yonatan Belinkov
DiffM
54
2
0
12 Jan 2025
Towards LifeSpan Cognitive Systems
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM
CLL
188
1
0
20 Sep 2024
InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long
  Motion Generation
InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation
Zeyu Zhang
Akide Liu
Qi Chen
Feng Chen
Ian Reid
Richard Hartley
Bohan Zhuang
Hao Tang
Mamba
31
9
0
14 Jul 2024
MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory
MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory
Ali Modarressi
Abdullatif Köksal
Ayyoob Imani
Mohsen Fayyaz
Hinrich Schütze
KELM
112
9
0
17 Apr 2024
The pitfalls of next-token prediction
The pitfalls of next-token prediction
Gregor Bachmann
Vaishnavh Nagarajan
37
63
0
11 Mar 2024
Cached Transformers: Improving Transformers with Differentiable Memory
  Cache
Cached Transformers: Improving Transformers with Differentiable Memory Cache
Zhaoyang Zhang
Wenqi Shao
Yixiao Ge
Xiaogang Wang
Liang Feng
Ping Luo
19
2
0
20 Dec 2023
Uncertainty Guided Global Memory Improves Multi-Hop Question Answering
Uncertainty Guided Global Memory Improves Multi-Hop Question Answering
Alsu Sagirova
Andrey Kravchenko
RALM
28
1
0
29 Nov 2023
Vision Transformers Need Registers
Vision Transformers Need Registers
Zilong Chen
Maxime Oquab
Julien Mairal
Huaping Liu
ViT
62
312
0
28 Sep 2023
Scaling Transformer to 1M tokens and beyond with RMT
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov
Yuri Kuratov
Yermek Kapushev
Andrey Kravchenko
LRM
25
87
0
19 Apr 2023
Adaptive Computation with Elastic Input Sequence
Adaptive Computation with Elastic Input Sequence
Fuzhao Xue
Valerii Likhosherstov
Anurag Arnab
N. Houlsby
Mostafa Dehghani
Yang You
31
19
0
30 Jan 2023
Token Turing Machines
Token Turing Machines
Michael S. Ryoo
K. Gopalakrishnan
Kumara Kahatapitiya
Ted Xiao
Kanishka Rao
Austin Stone
Yao Lu
Julian Ibarz
Anurag Arnab
27
21
0
16 Nov 2022
Recurrent Memory Transformer
Recurrent Memory Transformer
Aydar Bulatov
Yuri Kuratov
Andrey Kravchenko
CLL
13
102
0
14 Jul 2022
Linearizing Transformer with Key-Value Memory
Linearizing Transformer with Key-Value Memory
Yizhe Zhang
Deng Cai
22
5
0
23 Mar 2022
StoryDB: Broad Multi-language Narrative Dataset
StoryDB: Broad Multi-language Narrative Dataset
Alexey Tikhonov
Igor Samenko
Ivan P. Yamshchikov
46
5
0
29 Sep 2021
Combining Transformers with Natural Language Explanations
Combining Transformers with Natural Language Explanations
Federico Ruggeri
Marco Lippi
Paolo Torroni
25
1
0
02 Sep 2021
MedGPT: Medical Concept Prediction from Clinical Narratives
MedGPT: Medical Concept Prediction from Clinical Narratives
Z. Kraljevic
Anthony Shek
D. Bean
R. Bendayan
J. Teo
Richard J. B. Dobson
LM&MA
AI4TS
MedIm
25
39
0
07 Jul 2021
Big Bird: Transformers for Longer Sequences
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
288
2,017
0
28 Jul 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018
1