2006.11527
Cited By
Memory Transformer
20 June 2020
Mikhail S. Burtsev
Yuri Kuratov
Anton Peganov
Grigory V. Sapunov
RALM
Papers citing "Memory Transformer"
20 / 20 papers shown
Title
Compact Recurrent Transformer with Persistent Memory
Edison Mucllari
Z. Daniels
David C. Zhang
Qiang Ye
CLL
VLM
54
0
0
02 May 2025
A generative approach to LLM harmfulness detection with special red flag tokens
Sophie Xhonneux
David Dobre
Mehrnaz Mofakhami
Leo Schwinn
Gauthier Gidel
55
1
0
22 Feb 2025
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models
Michael Toker
Ido Galil
Hadas Orgad
Rinon Gal
Yoad Tewel
Gal Chechik
Yonatan Belinkov
DiffM
54
2
0
12 Jan 2025
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM
CLL
188
1
0
20 Sep 2024
InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation
Zeyu Zhang
Akide Liu
Qi Chen
Feng Chen
Ian Reid
Richard Hartley
Bohan Zhuang
Hao Tang
Mamba
31
9
0
14 Jul 2024
MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory
Ali Modarressi
Abdullatif Köksal
Ayyoob Imani
Mohsen Fayyaz
Hinrich Schütze
KELM
112
9
0
17 Apr 2024
The pitfalls of next-token prediction
Gregor Bachmann
Vaishnavh Nagarajan
37
63
0
11 Mar 2024
Cached Transformers: Improving Transformers with Differentiable Memory Cache
Zhaoyang Zhang
Wenqi Shao
Yixiao Ge
Xiaogang Wang
Jinwei Gu
Ping Luo
19
2
0
20 Dec 2023
Uncertainty Guided Global Memory Improves Multi-Hop Question Answering
Alsu Sagirova
Mikhail Burtsev
RALM
28
1
0
29 Nov 2023
Vision Transformers Need Registers
Timothée Darcet
Maxime Oquab
Julien Mairal
Piotr Bojanowski
ViT
62
312
0
28 Sep 2023
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov
Yuri Kuratov
Yermek Kapushev
Mikhail Burtsev
LRM
25
87
0
19 Apr 2023
Adaptive Computation with Elastic Input Sequence
Fuzhao Xue
Valerii Likhosherstov
Anurag Arnab
N. Houlsby
Mostafa Dehghani
Yang You
31
19
0
30 Jan 2023
Token Turing Machines
Michael S. Ryoo
K. Gopalakrishnan
Kumara Kahatapitiya
Ted Xiao
Kanishka Rao
Austin Stone
Yao Lu
Julian Ibarz
Anurag Arnab
27
21
0
16 Nov 2022
Recurrent Memory Transformer
Aydar Bulatov
Yuri Kuratov
Mikhail Burtsev
CLL
13
102
0
14 Jul 2022
Linearizing Transformer with Key-Value Memory
Yizhe Zhang
Deng Cai
22
5
0
23 Mar 2022
StoryDB: Broad Multi-language Narrative Dataset
Alexey Tikhonov
Igor Samenko
Ivan P. Yamshchikov
46
5
0
29 Sep 2021
Combining Transformers with Natural Language Explanations
Federico Ruggeri
Marco Lippi
Paolo Torroni
25
1
0
02 Sep 2021
MedGPT: Medical Concept Prediction from Clinical Narratives
Z. Kraljevic
Anthony Shek
D. Bean
R. Bendayan
J. Teo
Richard J. B. Dobson
LM&MA
AI4TS
MedIm
25
39
0
07 Jul 2021
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
288
2,017
0
28 Jul 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018