Introducing Rhetorical Parallelism Detection: A New Task with Datasets, Metrics, and Baselines

30 November 2023

Abstract

Rhetoric, both spoken and written, involves not only content but also style. One common stylistic tool is $\textit{parallelism}$: the juxtaposition of phrases which have the same sequence of linguistic ($\textit{e.g.}$, phonological, syntactic, semantic) features. Despite the ubiquity of parallelism, the field of natural language processing has seldom investigated it, missing a chance to better understand the nature of the structure, meaning, and intent that humans convey. To address this, we introduce the task of $\textit{rhetorical parallelism detection}$. We construct a formal definition of it; we provide one new Latin dataset and one adapted Chinese dataset for it; we establish a family of metrics to evaluate performance on it; and, lastly, we create baseline systems and novel sequence labeling schemes to capture it. On our strictest metric, we attain $F_{1}$ scores of $0.40$ and $0.43$ on our Latin and Chinese datasets, respectively.
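The abstract reports $F_{1}$ scores but does not define the metrics themselves. As a rough illustration of how an $F_{1}$ score is computed in general, the sketch below scores predicted spans against gold spans under an exact-match criterion. The span representation, the exact-match rule, and the function name `exact_match_f1` are assumptions made for this example; they are not the paper's metric definitions.

```python
from typing import Set, Tuple

# Hypothetical representation: a span is a (start, end) pair of token offsets.
Span = Tuple[int, int]


def exact_match_f1(gold: Set[Span], predicted: Set[Span]) -> float:
    """F1 over spans, counting a prediction as correct only on exact match.

    This is an assumed, generic criterion, not the paper's metric.
    """
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy data: four gold spans, two predictions, one of which matches exactly.
gold = {(0, 3), (4, 7), (10, 12), (15, 18)}
predicted = {(0, 3), (4, 6)}
score = exact_match_f1(gold, predicted)
```

Here precision is 1/2 and recall is 1/4, so the harmonic mean is 1/3. Under an exact-match rule, the near miss `(4, 6)` earns no credit, which is why a strict metric of this kind tends to give lower scores than one that rewards partial overlap.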

