Generalization through Memorization: Nearest Neighbor Language Models

1 November 2019
Urvashi Khandelwal
Omer Levy
Dan Jurafsky
Luke Zettlemoyer
M. Lewis
    RALM

Papers citing "Generalization through Memorization: Nearest Neighbor Language Models"

47 / 597 papers shown
Title
Revisiting Simple Neural Probabilistic Language Models
Simeng Sun
Mohit Iyyer
87
14
0
08 Apr 2021
Perspective, Survey and Trends: Public Driving Datasets and Toolsets for Autonomous Driving Virtual Test
Pengliang Ji
Li Ruan
Yunzhi Xue
Limin Xiao
Qian Dong
165
8
0
01 Apr 2021
A Neighbourhood Framework for Resource-Lean Content Flagging
Sheikh Muhammad Sarwar
Dimitrina Zlatkova
Momchil Hardalov
Yoan Dinkov
Isabelle Augenstein
Preslav Nakov
67
5
0
31 Mar 2021
BASE Layers: Simplifying Training of Large, Sparse Models
M. Lewis
Shruti Bhosale
Tim Dettmers
Naman Goyal
Luke Zettlemoyer
MoE
224
285
0
30 Mar 2021
Structure Inducing Pre-Training
Matthew B. A. McDermott
Brendan Yap
Peter Szolovits
Marinka Zitnik
110
21
0
18 Mar 2021
Retrieval Augmentation for Deep Neural Networks
R. Ramos
Patrícia Pereira
Helena Moniz
Joao Paulo Carvalho
Bruno Martins
VLM
32
0
0
25 Feb 2021
When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute
Tao Lei
RALM VLM
170
49
0
24 Feb 2021
Leveraging Reinforcement Learning for evaluating Robustness of KNN Search Algorithms
Pramod Vadiraja
Christoph Balada
OOD
51
1
0
10 Feb 2021
Adaptive Semiparametric Language Models
Dani Yogatama
Cyprien de Masson d'Autume
Lingpeng Kong
KELM RALM
108
100
0
04 Feb 2021
Mind the Gap: Assessing Temporal Generalization in Neural Language Models
Angeliki Lazaridou
A. Kuncoro
E. Gribovskaya
Devang Agrawal
Adam Liska
...
Sebastian Ruder
Dani Yogatama
Kris Cao
Susannah Young
Phil Blunsom
VLM
151
219
0
03 Feb 2021
CNN with large memory layers
R. Karimov
Yury Malkov
Karim Iskakov
Victor Lempitsky
50
0
0
27 Jan 2021
Data-to-text Generation by Splicing Together Nearest Neighbors
Sam Wiseman
A. Backurs
K. Stratos
88
9
0
20 Jan 2021
Diagnostic Captioning: A Survey
John Pavlopoulos
Vasiliki Kougia
Ion Androutsopoulos
D. Papamichail
3DV MedIm
157
30
0
18 Jan 2021
What Makes Good In-Context Examples for GPT-3?
Jiachang Liu
Dinghan Shen
Yizhe Zhang
Bill Dolan
Lawrence Carin
Weizhu Chen
AAML RALM
420
1,399
0
17 Jan 2021
Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers
Machel Reid
Edison Marrese-Taylor
Y. Matsuo
MoE
112
48
0
01 Jan 2021
Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press
Noah A. Smith
M. Lewis
328
91
0
31 Dec 2020
FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging
Han Guo
Nazneen Rajani
Peter Hase
Joey Tianyi Zhou
Caiming Xiong
TDI
135
116
0
31 Dec 2020
Modifying Memories in Transformer Models
Chen Zhu
A. S. Rawat
Manzil Zaheer
Srinadh Bhojanapalli
Daliang Li
Felix X. Yu
Sanjiv Kumar
KELM
130
203
0
01 Dec 2020
Cross-Domain Generalization Through Memorization: A Study of Nearest Neighbors in Neural Duplicate Question Detection
Yadollah Yaghoobzadeh
Alexandre Rochette
Timothy J. Hazen
OOD
26
1
0
22 Nov 2020
Language Models are Open Knowledge Graphs
Chenguang Wang
Xiao Liu
Basel Alomair
SSL KELM
81
137
0
22 Oct 2020
Limitations of Autoregressive Models and Their Alternatives
Chu-cheng Lin
Aaron Jaech
Xin Li
Matthew R. Gormley
Jason Eisner
89
64
0
22 Oct 2020
Explaining and Improving Model Behavior with k Nearest Neighbor Representations
Nazneen Rajani
Ben Krause
Wenpeng Yin
Tong Niu
R. Socher
Caiming Xiong
FAtt
72
34
0
18 Oct 2020
Example-Driven Intent Prediction with Observers
Shikib Mehri
Mihail Eric
72
40
0
17 Oct 2020
Large Product Key Memory for Pretrained Language Models
Gyuwan Kim
Tae-Hwan Jung
VLM KELM
94
2
0
08 Oct 2020
Learning to Recombine and Resample Data for Compositional Generalization
Ekin Akyürek
Afra Feyza Akyürek
Jacob Andreas
86
81
0
08 Oct 2020
Efficient Meta Lifelong-Learning with Limited Memory
Zirui Wang
Sanket Vaibhav Mehta
Barnabás Póczos
J. Carbonell
CLL KELM
81
76
0
06 Oct 2020
Nearest Neighbor Machine Translation
Urvashi Khandelwal
Angela Fan
Dan Jurafsky
Luke Zettlemoyer
M. Lewis
RALM
96
288
0
01 Oct 2020
Case-Based Abductive Natural Language Inference
Marco Valentino
Mokanarangan Thayaparan
André Freitas
82
5
0
30 Sep 2020
Controllable Text Generation with Focused Variation
Lei Shu
Alexandros Papangelis
Yi-Chia Wang
Gokhan Tur
Hu Xu
Zhaleh Feizollahi
Bing-Quan Liu
Piero Molino
86
11
0
25 Sep 2020
Grounded Compositional Outputs for Adaptive Language Modeling
Nikolaos Pappas
Phoebe Mulcaire
Noah A. Smith
KELM
82
7
0
24 Sep 2020
Taking Notes on the Fly Helps BERT Pre-training
Qiyu Wu
Chen Xing
Yatao Li
Guolin Ke
Di He
Tie-Yan Liu
56
10
0
04 Aug 2020
Neural Language Generation: Formulation, Methods, and Evaluation
Cristina Garbacea
Qiaozhu Mei
160
30
0
31 Jul 2020
Neural Composition: Learning to Generate from Multiple Models
Denis Filimonov
R. Gadde
Ariya Rastrow
28
3
0
10 Jul 2020
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong
Chenyan Xiong
Ye Li
Kwok-Fung Tang
Jialin Liu
Paul N. Bennett
Junaid Ahmed
Arnold Overwijk
164
1,241
0
01 Jul 2020
Learning Sparse Prototypes for Text Generation
Junxian He
Taylor Berg-Kirkpatrick
Graham Neubig
88
23
0
29 Jun 2020
Train and You'll Miss It: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings
Mayee F. Chen
Daniel Y. Fu
Frederic Sala
Sen Wu
Ravi Teja Mullapudi
Fait Poms
Kayvon Fatahalian
Christopher Ré
61
10
0
26 Jun 2020
Pre-training via Paraphrasing
M. Lewis
Marjan Ghazvininejad
Gargi Ghosh
Armen Aghajanyan
Sida I. Wang
Luke Zettlemoyer
AIMat
114
161
0
26 Jun 2020
A Simple Approach to Case-Based Reasoning in Knowledge Bases
Rajarshi Das
Ameya Godbole
Shehzaad Dhuliawala
Manzil Zaheer
Andrew McCallum
76
24
0
25 Jun 2020
Cross-lingual Retrieval for Iterative Self-Supervised Training
C. Tran
Y. Tang
Xian Li
Jiatao Gu
RALM
77
75
0
16 Jun 2020
BERT-kNN: Adding a kNN Search Component to Pretrained Language Models for Better QA
Nora Kassner
Hinrich Schütze
RALM
119
69
0
02 May 2020
Augmenting Transformers with KNN-Based Composite Memory for Dialogue
Angela Fan
Claire Gardent
Chloé Braud
Antoine Bordes
RALM
171
76
0
27 Apr 2020
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM AI4CE CLL
227
2,454
0
23 Apr 2020
Exemplar VAE: Linking Generative Models, Nearest Neighbor Retrieval, and Data Augmentation
Sajad Norouzi
David J. Fleet
Mohammad Norouzi
VLM DRL
65
3
0
09 Apr 2020
REALM: Retrieval-Augmented Language Model Pre-Training
Kelvin Guu
Kenton Lee
Zora Tung
Panupong Pasupat
Ming-Wei Chang
RALM
184
2,133
0
10 Feb 2020
Improving Transformer Models by Reordering their Sublayers
Ofir Press
Noah A. Smith
Omer Levy
87
88
0
10 Nov 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
403
1,926
0
17 Sep 2019
Billion-scale similarity search with GPUs
Jeff Johnson
Matthijs Douze
Hervé Jégou
469
3,759
0
28 Feb 2017