Training Plug-n-Play Knowledge Modules with Deep Context Distillation

11 March 2025
Lucas Page-Caccia
Alan Ansell
Edoardo Ponti
Ivan Vulić
Alessandro Sordoni
Abstract

Dynamically integrating new or rapidly evolving information after (Large) Language Model pre-training remains challenging, particularly in low-data scenarios or when dealing with private and specialized documents. In-context learning and retrieval-augmented generation (RAG) face limitations, including high inference costs and an inability to capture global document information. In this paper, we propose a way of modularizing knowledge by training document-level Knowledge Modules (KMs). KMs are lightweight components implemented as parameter-efficient LoRA modules, which are trained to store information about new documents and can be plugged into models on demand. We show that next-token prediction performs poorly as a training objective for KMs. We instead propose Deep Context Distillation: we learn KM parameters so as to match the hidden states and logits of a teacher that takes the document in context. Our method outperforms standard next-token prediction and pre-instruction training techniques across two datasets. Finally, we highlight synergies between KMs and RAG.
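
The abstract outlines the training recipe at a high level: a frozen teacher that conditions on the full document produces hidden states and logits, and a LoRA-based Knowledge Module is trained to reproduce them without the document in context. The sketch below illustrates one plausible reading of that objective using Hugging Face transformers and peft; the base model name, LoRA hyperparameters, the dcd_loss helper, the per-layer L1 hidden-state term, and the weighting alpha are illustrative assumptions, not the authors' implementation.

# Minimal sketch of Deep Context Distillation for a document-level Knowledge Module (KM).
# Assumptions (not from the paper's code): model name, LoRA hyperparameters, the chunking
# scheme, the hidden-state loss (L1), and the weighting `alpha` are placeholders.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-3.2-1B"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
base = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# The KM is a parameter-efficient LoRA adapter; only its weights are trained.
km_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
student = get_peft_model(base, km_config)

def dcd_loss(document: str, chunk: str, alpha: float = 1.0) -> torch.Tensor:
    """Distill a teacher that sees the document in context into a KM that does not."""
    doc_ids = tokenizer(document, return_tensors="pt").input_ids
    chunk_ids = tokenizer(chunk, return_tensors="pt", add_special_tokens=False).input_ids
    teacher_ids = torch.cat([doc_ids, chunk_ids], dim=1)

    # Teacher: the frozen base model (adapter disabled), conditioned on document + chunk.
    with torch.no_grad(), student.disable_adapter():
        t_out = student(input_ids=teacher_ids, output_hidden_states=True)
    n = chunk_ids.size(1)
    t_logits = t_out.logits[:, -n:, :]
    t_hidden = [h[:, -n:, :] for h in t_out.hidden_states]

    # Student: base model + KM adapter, sees only the chunk (no document in context).
    s_out = student(input_ids=chunk_ids, output_hidden_states=True)

    # Logit distillation (KL) plus hidden-state matching across layers ("deep" distillation).
    kl = F.kl_div(
        F.log_softmax(s_out.logits.float(), dim=-1),
        F.log_softmax(t_logits.float(), dim=-1),
        log_target=True,
        reduction="batchmean",
    )
    hidden = torch.stack(
        [F.l1_loss(s.float(), t.float()) for s, t in zip(s_out.hidden_states, t_hidden)]
    ).mean()
    return kl + alpha * hidden

In this reading, training would optimize only the LoRA parameters over chunks of the target document while the base weights stay frozen; at inference the adapter can be attached or detached on demand, which is the plug-and-play property the abstract refers to.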

View on arXiv: https://arxiv.org/abs/2503.08727
@article{caccia2025_2503.08727,
  title={Training Plug-n-Play Knowledge Modules with Deep Context Distillation},
  author={Lucas Caccia and Alan Ansell and Edoardo Ponti and Ivan Vulić and Alessandro Sordoni},
  journal={arXiv preprint arXiv:2503.08727},
  year={2025}
}