A Hybrid Approach to Information Retrieval and Answer Generation for Regulatory Texts

24 February 2025
Jhon Rayo
Raul de la Rosa
Mario Garrido
    AILaw
Abstract

Regulatory texts are inherently long and complex, presenting significant challenges for information retrieval systems in supporting regulatory officers with compliance tasks. This paper introduces a hybrid information retrieval system that combines lexical and semantic search techniques to extract relevant information from large regulatory corpora. The system integrates a fine-tuned sentence transformer model with the traditional BM25 algorithm to achieve both semantic precision and lexical coverage. To generate accurate and comprehensive responses, retrieved passages are synthesized using Large Language Models (LLMs) within a Retrieval Augmented Generation (RAG) framework. Experimental results demonstrate that the hybrid system significantly outperforms standalone lexical and semantic approaches, with notable improvements in Recall@10 and MAP@10. By openly sharing our fine-tuned model and methodology, we aim to advance the development of robust natural language processing tools for compliance-driven applications in regulatory domains.
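The abstract does not state how the lexical and semantic scores are combined, so the following is only a minimal sketch under stated assumptions: BM25 scores and embedding similarities are each min-max normalized and then mixed with a weight `alpha`. The function names, the `alpha` parameter, the BM25 constants, and the toy similarity values are all hypothetical and are not taken from the paper. In a real system the semantic similarities would come from the fine-tuned sentence transformer. The two metric helpers show one common convention for Recall@k and average precision at k (MAP@k is the mean of the latter over queries).

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def min_max(values):
    """Rescale scores to [0, 1]; constant inputs map to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(lexical, semantic, alpha=0.5):
    """Rank document indices by alpha * semantic + (1 - alpha) * lexical.

    Assumed fusion rule (not confirmed by the abstract): both score lists
    are min-max normalized before the weighted sum.
    """
    lex, sem = min_max(lexical), min_max(semantic)
    fused = [alpha * s + (1 - alpha) * l for l, s in zip(lex, sem)]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


def recall_at_k(ranked, relevant, k=10):
    """Fraction of relevant documents that appear in the top k."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def average_precision_at_k(ranked, relevant, k=10):
    """Average precision over the top k, normalized by min(|relevant|, k)."""
    relevant = set(relevant)
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    denom = min(len(relevant), k)
    return total / denom if denom else 0.0


if __name__ == "__main__":
    docs = [
        ["capital", "requirements", "for", "banks"],
        ["reporting", "obligations", "for", "funds"],
        ["capital", "buffer", "rules"],
    ]
    query = ["capital", "requirements"]
    lexical = bm25_scores(query, docs)
    # Placeholder similarities standing in for embedding cosine scores.
    semantic = [0.3, 0.9, 0.8]
    ranking = hybrid_rank(lexical, semantic, alpha=0.6)
    print(ranking, recall_at_k(ranking, [0, 2], k=2))
```

Setting `alpha=0` reduces the ranking to pure BM25 and `alpha=1` to pure semantic similarity, which makes it easy to compare the hybrid against each standalone baseline on the same metrics.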

@article{rayo2025_2502.16767,
  title={A Hybrid Approach to Information Retrieval and Answer Generation for Regulatory Texts},
  author={Jhon Rayo and Raul de la Rosa and Mario Garrido},
  journal={arXiv preprint arXiv:2502.16767},
  year={2025}
}