Advancements in large language models (LLMs) have brought significant progress on NLP tasks. However, if a task cannot be fully described in a prompt, the model may fail to carry it out. In this paper, we propose a simple yet effective method to contextualize a task for an LLM. The method (1) performs open-ended zero-shot inference over the entire dataset, (2) aggregates the inference results into meta-information, and (3) incorporates the aggregated meta-information into the prompt for the actual task. We demonstrate the method's effectiveness on text clustering tasks, enabling LLMs to perform text-to-text clustering and yielding improvements on several datasets. Furthermore, we examine the class labels generated for clustering, showing how the LLM comes to understand the task through the data.
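As a rough illustration of the three-step pipeline described above, the sketch below runs open-ended zero-shot inference over a dataset, aggregates the generated labels by frequency, and then constrains the actual clustering prompt to the aggregated label set. Everything here is a hypothetical reconstruction from the abstract, not the paper's implementation: the `call_llm` helper, the prompt wording, and the frequency-based aggregation are all assumptions.

```python
from collections import Counter

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM API call (hypothetical; wire up your own client)."""
    raise NotImplementedError("substitute a real LLM client here")

def zerodl_cluster(texts: list[str], top_k: int = 10) -> dict[str, str]:
    # (1) Open-ended zero-shot inference over the entire dataset:
    # ask the LLM to name a topic for each text, with no label set given.
    raw_labels = [
        call_llm(f"In a few words, what is the topic of this text?\n\n{t}")
        for t in texts
    ]

    # (2) Aggregate the inference results into meta-information;
    # here, simply the most frequent generated labels.
    label_set = [label for label, _ in Counter(raw_labels).most_common(top_k)]

    # (3) Incorporate the aggregated meta-information into the actual task:
    # each assignment is constrained to the dataset-derived label set.
    assignments = {}
    for t in texts:
        prompt = (
            "Assign the text to exactly one of these categories: "
            f"{', '.join(label_set)}.\n\nText: {t}\nCategory:"
        )
        assignments[t] = call_llm(prompt).strip()
    return assignments
```

In this reading, step (2) is what makes the clustering "text-to-text": the label space itself is generated from the data rather than given in advance.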
@article{jo2025_2406.13342,
  title   = {ZeroDL: Zero-shot Distribution Learning for Text Clustering via Large Language Models},
  author  = {Hwiyeol Jo and Hyunwoo Lee and Kang Min Yoo and Taiwoo Park},
  journal = {arXiv preprint arXiv:2406.13342},
  year    = {2025}
}