GMAT: Global Memory Augmentation for Transformers
Ankit Gupta, Jonathan Berant
arXiv 2006.03274 · 5 June 2020 · Tags: RALM
Papers citing "GMAT: Global Memory Augmentation for Transformers" (17 of 17 papers shown):
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
Alexis Huet, Zied Ben-Houidi, Dario Rossi · LLMAG · 62 / 0 / 0 · 21 Jan 2025
InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation
Zeyu Zhang, Akide Liu, Qi Chen, Feng Chen, Ian Reid, Richard Hartley, Bohan Zhuang, Hao Tang · Mamba · 41 / 9 / 0 · 14 Jul 2024
Uncertainty Guided Global Memory Improves Multi-Hop Question Answering
Alsu Sagirova, Andrey Kravchenko · RALM · 33 / 1 / 0 · 29 Nov 2023
Associative Transformer
Yuwei Sun, H. Ochiai, Zhirong Wu, Stephen Lin, Ryota Kanai · ViT · 65 / 0 / 0 · 22 Sep 2023
Focus Your Attention (with Adaptive IIR Filters)
Shahar Lutati, Itamar Zimerman, Lior Wolf · 37 / 10 / 0 · 24 May 2023
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen, Lala Li · 37 / 12 / 0 · 22 May 2023
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Andrey Kravchenko · LRM · 27 / 87 / 0 · 19 Apr 2023
SeDR: Segment Representation Learning for Long Documents Dense Retrieval
Junying Chen, Qingcai Chen, Dongfang Li, Yutao Huang · 28 / 6 / 0 · 20 Nov 2022
Efficient Long-Text Understanding with Short-Text Models
Maor Ivgi, Uri Shaham, Jonathan Berant · VLM · 38 / 76 / 0 · 01 Aug 2022
Recurrent Memory Transformer
Aydar Bulatov, Yuri Kuratov, Andrey Kravchenko · CLL · 15 / 103 / 0 · 14 Jul 2022
Long Range Language Modeling via Gated State Spaces
Harsh Mehta, Ankit Gupta, Ashok Cutkosky, Behnam Neyshabur · Mamba · 39 / 232 / 0 · 27 Jun 2022
CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation
Qihang Yu, Huiyu Wang, Dahun Kim, Siyuan Qiao, Maxwell D. Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen · ViT, MedIm · 37 / 90 / 0 · 17 Jun 2022
Diagonal State Spaces are as Effective as Structured State Spaces
Ankit Gupta, Albert Gu, Jonathan Berant · 59 / 293 / 0 · 27 Mar 2022
MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers
Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen · ViT · 55 / 527 / 0 · 01 Dec 2020
Longformer: The Long-Document Transformer
Iz Beltagy, Matthew E. Peters, Arman Cohan · RALM, VLM · 33 / 3,944 / 0 · 10 Apr 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier · MoE · 255 / 580 / 0 · 12 Mar 2020
Language Models as Knowledge Bases?
Fabio Petroni, Tim Rocktaschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel · KELM, AI4MH · 458 / 2,592 / 0 · 03 Sep 2019