Regularized Training of Nearest Neighbor Language Models

16 September 2021
Jean-François Ton, Walter A. Talbott, Shuangfei Zhai, J. Susskind
Abstract

Including memory banks in a natural language processing architecture increases model capacity by equipping it with additional data at inference time. In this paper, we build upon kNN-LM (Khandelwal et al., 2020), which uses a pre-trained language model together with an exhaustive kNN search through the training data (memory bank) to achieve state-of-the-art results. We investigate whether we can improve kNN-LM performance by instead training an LM with the knowledge that we will be using a kNN search post-hoc. We achieve a significant improvement with our method on language modeling tasks on WIKI-2 and WIKI-103. The main phenomenon we encounter is that adding a simple L2 regularization on the activations (not the weights) of the model, a transformer, improves the post-hoc kNN classification performance. We explore some possible reasons for this improvement. In particular, we find that the added L2 regularization seems to improve the performance for high-frequency words without deteriorating the performance for low-frequency ones.
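To make the core idea concrete, below is a minimal PyTorch-style sketch, not the authors' released code, of how an L2 penalty on the transformer's hidden activations can be added to the standard language-modeling loss, and how the resulting model's output is combined with a retrieval distribution in the post-hoc kNN-LM step. The names l2_coef and lmbda and their values are illustrative assumptions, not parameters taken from the paper.

import torch
import torch.nn.functional as F

def training_loss(logits, hidden, targets, l2_coef=1e-4):
    # Standard cross-entropy language-modeling loss.
    ce = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
    # L2 penalty on the model's activations (not its weights):
    # `hidden` holds the final-layer transformer activations of shape
    # (batch, seq, dim), i.e. the vectors later used as keys/queries
    # for the post-hoc kNN search over the memory bank.
    act_l2 = hidden.pow(2).sum(dim=-1).mean()
    return ce + l2_coef * act_l2

def knn_lm_probs(p_lm, p_knn, lmbda=0.25):
    # Post-hoc kNN-LM interpolation (Khandelwal et al., 2020): mix the
    # parametric LM distribution with the kNN retrieval distribution.
    return lmbda * p_knn + (1.0 - lmbda) * p_lm

The point of regularizing the activations rather than the weights is that the retrieval step operates directly in the space of these hidden vectors, so controlling their norms affects the quality of the post-hoc kNN search.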

View on arXiv: 2109.08249