Initial Investigation of LLM-Assisted Development of Rule-Based Clinical NLP System

19 June 2025

Main: 10 pages, 3 tables

Abstract

Despite advances in machine learning (ML) and large language models (LLMs), rule-based natural language processing (NLP) systems remain in active use in clinical settings because of their interpretability and operational efficiency. However, developing and maintaining them by hand is labor-intensive, particularly for tasks with large linguistic variability. To overcome these limitations, we propose a novel approach that employs LLMs solely during the development phase of rule-based systems. We conducted initial experiments on the first two steps of developing a rule-based NLP pipeline: (1) finding relevant snippets in the clinical note, and (2) extracting informative keywords from those snippets for the rule-based named entity recognition (NER) component. The experiments showed exceptional recall in identifying clinically relevant text snippets (DeepSeek: 0.98, Qwen: 0.99) and 1.0 in extracting key terms for NER. This study points to a promising new direction for NLP development: semi-automated or automated development of rule-based systems whose execution is significantly faster, more cost-effective, and more transparent than that of deep learning model-based solutions.
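
As a rough illustration of where the extracted keywords end up, the sketch below shows a minimal keyword-driven NER matcher and a recall computation. It is not the authors' implementation: the keyword list, the function names `build_matcher`, `find_entities`, and `recall`, and the sample sentence are all hypothetical, and the regular-expression matcher merely stands in for whatever rule engine the paper uses.

```python
import re
from typing import List, Set, Tuple

# Hypothetical keyword list. In the workflow described in the abstract, such
# terms would be proposed by an LLM at development time; none of these come
# from the paper.
KEYWORDS = ["pneumonia", "surgical site infection", "urinary tract infection"]


def build_matcher(keywords: List[str]) -> re.Pattern:
    """Compile one case-insensitive pattern that matches any keyword."""
    # Put longer terms first so multi-word keywords win over shorter overlaps.
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in ordered)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


def find_entities(text: str, matcher: re.Pattern) -> List[Tuple[int, int, str]]:
    """Return (start, end, matched_text) for every keyword hit in the text."""
    return [(m.start(), m.end(), m.group(0)) for m in matcher.finditer(text)]


def recall(predicted: Set[str], gold: Set[str]) -> float:
    """Recall = true positives / all gold items (1.0 if there are no gold items)."""
    if not gold:
        return 1.0
    return len(predicted & gold) / len(gold)
```

At run time only the compiled pattern is needed, so no LLM call is involved once the keyword list has been fixed, which is the property the abstract attributes to the rule-based approach.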

@article{shi2025_2506.16628,
  title={Initial Investigation of LLM-Assisted Development of Rule-Based Clinical NLP System},
  author={Jianlin Shi and Brian T. Bucher},
  journal={arXiv preprint arXiv:2506.16628},
  year={2025}
}
