Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data

3 April 2025
Waris Gill
Justin Cechmanek
Tyler Hutcherson
Srijith Rajamohan
Jen Agarwal
Muhammad Ali Gulzar
Manvinder Singh
Benoit Dion
Abstract

This report investigates enhancing semantic caching effectiveness by employing specialized, fine-tuned embedding models. Semantic caching relies on embedding similarity rather than exact key matching, presenting unique challenges in balancing precision, query latency, and computational efficiency. We propose leveraging smaller, domain-specific embedding models, fine-tuned with targeted real-world and synthetically generated datasets. Our empirical evaluations demonstrate that compact embedding models fine-tuned for just one epoch on specialized datasets significantly surpass both state-of-the-art open-source and proprietary alternatives in precision and recall. Moreover, we introduce a novel synthetic data generation pipeline for the semantic cache that mitigates the challenge of limited domain-specific annotated data, further boosting embedding performance. Our approach effectively balances computational overhead and accuracy, establishing a viable and efficient strategy for practical semantic caching implementations.
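To make the core mechanism concrete, here is a minimal semantic-cache sketch (illustrative only, not the authors' implementation): incoming queries are embedded, and a cached response is returned when the cosine similarity to a previously stored query exceeds a threshold. It assumes the sentence-transformers library; the model name and the 0.85 threshold are placeholder choices, not values from the paper.

# Minimal semantic cache sketch (illustrative; not the paper's code).
# Assumes sentence-transformers; model name and threshold are placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

class SemanticCache:
    def __init__(self, model_name="all-MiniLM-L6-v2", threshold=0.85):
        self.model = SentenceTransformer(model_name)
        self.threshold = threshold
        self.embeddings = []   # stored query embeddings (unit-normalized)
        self.responses = []    # cached LLM responses, aligned by index

    def _embed(self, text):
        vec = self.model.encode(text)
        return vec / np.linalg.norm(vec)  # normalize so dot product = cosine similarity

    def get(self, query):
        """Return a cached response if a stored query is similar enough, else None."""
        if not self.embeddings:
            return None
        q = self._embed(query)
        sims = np.stack(self.embeddings) @ q  # cosine similarities to all stored queries
        best = int(np.argmax(sims))
        if sims[best] >= self.threshold:
            return self.responses[best]  # cache hit: skip the LLM call
        return None

    def put(self, query, response):
        self.embeddings.append(self._embed(query))
        self.responses.append(response)

Raising the threshold trades recall for precision: fewer spurious cache hits, but more queries fall through to the LLM. This is exactly the balance that better-suited embeddings, such as the fine-tuned models the report evaluates, are meant to improve.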

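The fine-tuning step can likewise be sketched generically. The snippet below tunes a compact embedding model for a single epoch on labeled query pairs using the sentence-transformers fit API with a cosine-similarity loss; the toy pairs, model name, and hyperparameters are invented for illustration and do not reflect the paper's datasets, loss, or training setup.

# Generic one-epoch fine-tuning sketch (illustrative assumptions throughout).
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder compact model

# Toy labeled pairs: 1.0 = semantically equivalent queries, 0.0 = unrelated.
train_examples = [
    InputExample(texts=["how do I reset my password",
                        "steps to change my account password"], label=1.0),
    InputExample(texts=["how do I reset my password",
                        "what is my current account balance"], label=0.0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)

# A single epoch, mirroring the report's finding that one epoch suffices.
model.fit(train_objectives=[(train_dataloader, train_loss)],
          epochs=1, warmup_steps=10)
model.save("finetuned-cache-embedder")  # hypothetical output path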
@article{gill2025_2504.02268,
  title={Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data},
  author={Waris Gill and Justin Cechmanek and Tyler Hutcherson and Srijith Rajamohan and Jen Agarwal and Muhammad Ali Gulzar and Manvinder Singh and Benoit Dion},
  journal={arXiv preprint arXiv:2504.02268},
  year={2025}
}