ERASMO: Leveraging Large Language Models for Enhanced Clustering Segmentation

1 October 2024
Fillipe dos Santos Silva
Gabriel Kenzo Kakimoto
Julio Cesar dos Reis
Marcelo S. Reis
Abstract

Cluster analysis plays a crucial role in various domains and applications, such as customer segmentation in marketing. These contexts often involve multimodal data, including both tabular and textual datasets, making it challenging to represent hidden patterns for obtaining meaningful clusters. This study introduces ERASMO, a framework designed to fine-tune a pretrained language model on textually encoded tabular data and generate embeddings from the fine-tuned model. ERASMO employs a textual converter to transform tabular data into a textual format, enabling the language model to process and understand the data more effectively. Additionally, ERASMO produces contextually rich and structurally representative embeddings through techniques such as random feature sequence shuffling and number verbalization. Extensive experimental evaluations were conducted using multiple datasets and baseline approaches. Our results demonstrate that ERASMO fully leverages the specific context of each tabular dataset, leading to more precise and nuanced embeddings for accurate clustering. This approach enhances clustering performance by capturing complex relationship patterns within diverse tabular data.

@article{silva2025_2410.03738,
  title={ERASMO: Leveraging Large Language Models for Enhanced Clustering Segmentation},
  author={Fillipe dos Santos Silva and Gabriel Kenzo Kakimoto and Julio Cesar dos Reis and Marcelo S. Reis},
  journal={arXiv preprint arXiv:2410.03738},
  year={2025}
}