
JaccDiv: A Metric and Benchmark for Quantifying Diversity of Generated Marketing Text in the Music Industry

Anum Afzal
Alexandre Mercier
Florian Matthes
Abstract

Online platforms are increasingly interested in using data-to-text technologies to generate content and help their users. Unfortunately, traditional generative methods often fall into repetitive patterns, resulting in monotonous galleries of texts after only a few iterations. In this paper, we investigate LLM-based data-to-text approaches to automatically generate marketing texts that are of sufficient quality and diverse enough for broad adoption. We leverage language models such as T5, GPT-3.5, GPT-4, and LLaMa2 in conjunction with fine-tuning, few-shot, and zero-shot approaches to set a baseline for diverse marketing texts. We also introduce a metric, JaccDiv, to evaluate the diversity of a set of texts. This research is relevant beyond the music industry and may benefit other fields where repetitive automated content generation is prevalent.
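
The abstract does not define JaccDiv, so the following is only an illustrative sketch of one way a Jaccard-based diversity score over a set of texts could be computed. It assumes whitespace tokenization, lowercasing, and a mean of pairwise Jaccard distances; the function names `jaccard_similarity` and `pairwise_jaccard_diversity` are hypothetical and are not taken from the paper, whose actual formula may differ.

```python
from itertools import combinations


def jaccard_similarity(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two token sets."""
    if not a and not b:
        # Two empty sets are treated as identical (assumption).
        return 1.0
    return len(a & b) / len(a | b)


def pairwise_jaccard_diversity(texts: list[str]) -> float:
    """Mean pairwise Jaccard distance over a set of texts.

    Illustrative only: 0.0 means every pair shares the same vocabulary,
    1.0 means no pair shares any token. Not the paper's definition.
    """
    # Assumption: lowercase whitespace tokens stand in for a real tokenizer.
    token_sets = [set(text.lower().split()) for text in texts]
    pairs = list(combinations(token_sets, 2))
    if not pairs:
        # Fewer than two texts: no pair to compare, report zero diversity.
        return 0.0
    distances = [1.0 - jaccard_similarity(a, b) for a, b in pairs]
    return sum(distances) / len(distances)
```

Under these assumptions, a gallery of near-identical marketing texts scores close to 0, while texts with little shared vocabulary score close to 1. A set-based score like this ignores word order and repetition within a text, which is one reason a published metric may use n-grams or a different aggregation.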

@article{afzal2025_2504.20849,
  title={JaccDiv: A Metric and Benchmark for Quantifying Diversity of Generated Marketing Text in the Music Industry},
  author={Anum Afzal and Alexandre Mercier and Florian Matthes},
  journal={arXiv preprint arXiv:2504.20849},
  year={2025}
}