2306.13421
Long-range Language Modeling with Self-retrieval
23 June 2023
Ohad Rubin
Jonathan Berant
RALM
KELM
Papers citing
"Long-range Language Modeling with Self-retrieval"
25 / 25 papers shown
Title
Associative Recurrent Memory Transformer
Ivan Rodkin
Yuri Kuratov
Aydar Bulatov
Andrey Kravchenko
68
3
0
17 Feb 2025
Retrieval Augmented Spelling Correction for E-Commerce Applications
Xuan Guo
Rohit Patki
Dante Everaert
Christopher Potts
19
0
0
15 Oct 2024
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Ivan Rodkin
Dmitry Sorokin
Artyom Sorokin
Andrey Kravchenko
RALM
ALM
LRM
ReLM
ELM
51
61
0
14 Jun 2024
Reliable, Adaptable, and Attributable Language Models with Retrieval
Akari Asai
Zexuan Zhong
Danqi Chen
Pang Wei Koh
Luke Zettlemoyer
Hanna Hajishirzi
Wen-tau Yih
KELM
RALM
49
54
0
05 Mar 2024
Analyzing and Adapting Large Language Models for Few-Shot Multilingual NLU: Are We There Yet?
E. Razumovskaia
Ivan Vulić
Anna Korhonen
46
6
0
04 Mar 2024
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Soham De
Samuel L. Smith
Anushan Fernando
Aleksandar Botev
George-Christian Muraru
...
David Budden
Yee Whye Teh
Razvan Pascanu
Nando de Freitas
Çağlar Gülçehre
Mamba
61
117
0
29 Feb 2024
In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Dmitry Sorokin
Artyom Sorokin
Andrey Kravchenko
RALM
119
33
0
16 Feb 2024
Accelerating Retrieval-Augmented Language Model Serving with Speculation
Zhihao Zhang
Alan Zhu
Lijie Yang
Yihua Xu
Lanting Li
P. Phothilimthana
Zhihao Jia
RALM
KELM
56
16
0
25 Jan 2024
UniMS-RAG: A Unified Multi-source Retrieval-Augmented Generation for Personalized Dialogue Systems
Hongru Wang
Wenyu Huang
Yang Deng
Rui Wang
Zezhong Wang
Yufei Wang
Fei Mi
Jeff Z. Pan
Kam-Fai Wong
RALM
49
27
0
24 Jan 2024
KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection
Sehyun Choi
Tianqing Fang
Zhaowei Wang
Yangqiu Song
35
33
0
13 Oct 2023
CacheGen: KV Cache Compression and Streaming for Fast Language Model Serving
Yuhan Liu
Hanchen Li
Yihua Cheng
Siddhant Ray
Yuyang Huang
...
Ganesh Ananthanarayanan
Michael Maire
Henry Hoffmann
Ari Holtzman
Junchen Jiang
50
42
0
11 Oct 2023
Making Retrieval-Augmented Language Models Robust to Irrelevant Context
Ori Yoran
Tomer Wolfson
Ori Ram
Jonathan Berant
RALM
LRM
24
185
0
02 Oct 2023
Attention Sorting Combats Recency Bias In Long Context Language Models
A. Peysakhovich
Adam Lerer
LRM
RALM
49
43
0
28 Sep 2023
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
Qingyue Wang
Y. Fu
Yanan Cao
Zhiliang Tian
Shi Wang
Dacheng Tao
LLMAG
KELM
RALM
70
24
0
29 Aug 2023
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Mian
OffRL
70
538
0
12 Jul 2023
Lost in the Middle: How Language Models Use Long Contexts
Nelson F. Liu
Kevin Lin
John Hewitt
Ashwin Paranjape
Michele Bevilacqua
Fabio Petroni
Percy Liang
RALM
40
1,424
0
06 Jul 2023
Unlimiformer: Long-Range Transformers with Unlimited Length Input
Amanda Bertsch
Uri Alon
Graham Neubig
Matthew R. Gormley
RALM
116
122
0
02 May 2023
Resurrecting Recurrent Neural Networks for Long Sequences
Antonio Orvieto
Samuel L. Smith
Albert Gu
Anushan Fernando
Çağlar Gülçehre
Razvan Pascanu
Soham De
88
271
0
11 Mar 2023
Training Language Models with Memory Augmentation
Zexuan Zhong
Tao Lei
Danqi Chen
RALM
247
128
0
25 May 2022
Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval
Luyu Gao
Jamie Callan
RALM
175
330
0
12 Aug 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
282
2,000
0
31 Dec 2020
Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press
Noah A. Smith
M. Lewis
230
89
0
31 Dec 2020
Distilling Knowledge from Reader to Retriever for Question Answering
Gautier Izacard
Edouard Grave
RALM
185
251
0
08 Dec 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
288
2,023
0
28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
252
580
0
12 Mar 2020