Large Language Models in the Task of Automatic Validation of Text Classifier Predictions

Main: 40 pages; Appendix: 5 pages; Bibliography: 1 page; 10 figures; 18 tables
Abstract

Machine learning models for text classification are trained to predict a class for a given text. To do this, training and validation samples must be prepared: a set of texts is collected, and each text is assigned a class. These classes are usually assigned by human annotators with different expertise levels, depending on the specific classification task. Collecting such samples from scratch is labor-intensive because it requires finding specialists and compensating them for their work; moreover, the number of available specialists is limited, and their productivity is constrained by human factors. While it may not be too resource-intensive to collect samples once, the ongoing need to retrain models (especially in incremental learning pipelines) to address data drift (also called model drift) makes the data collection process crucial and costly over the model's entire lifecycle. This paper proposes several approaches to replace human annotators with Large Language Models (LLMs) to test classifier predictions for correctness, helping ensure model quality and support high-quality incremental learning.

@article{tsymbalov2025_2505.18688,
  title={Large Language Models in the Task of Automatic Validation of Text Classifier Predictions},
  author={Aleksandr Tsymbalov},
  journal={arXiv preprint arXiv:2505.18688},
  year={2025}
}
