Leveraging Large Language Models for Automated Definition Extraction with TaxoMatic: A Case Study on Media Bias

1 April 2025

Abstract

This paper introduces TaxoMatic, a framework that uses large language models to automate definition extraction from academic literature. Focusing on the media bias domain, the framework comprises three stages: data collection, LLM-based relevance classification, and extraction of conceptual definitions. Evaluated on a dataset of 2,398 manually rated articles, the framework proves effective, with Claude-3-sonnet achieving the best results in both relevance classification and definition extraction. Future directions include expanding the dataset and applying TaxoMatic to additional domains.

@article{spinde2025_2504.00343,
  title={Leveraging Large Language Models for Automated Definition Extraction with TaxoMatic: A Case Study on Media Bias},
  author={Timo Spinde and Luyang Lin and Smi Hinterreiter and Isao Echizen},
  journal={arXiv preprint arXiv:2504.00343},
  year={2025}
}