Measuring Structural Distances between Texts

17 March 2014

U. Fahrenberg

Abstract

We define and use a new inter-textual distance which, unlike other common approaches, measures differences not only in occurrences of words but also in occurrences of multi-word phrases. We show that this distance can be calculated easily, and we use it for statistical analysis of some sample corpora of genuine and fake scientific papers.
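
The abstract does not state the definition of the distance, so the following is only an illustrative sketch of the general idea of comparing texts on multi-word phrases as well as single words. It is not the paper's distance: the function names (`ngrams`, `ngram_distance`, `phrase_distance`), the whitespace tokenization, the use of a Jaccard-style multiset distance, and the plain averaging over phrase lengths are all assumptions made for this example.

```python
from collections import Counter


def ngrams(words, n):
    """Return all contiguous n-word phrases of a token list as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_distance(text_a, text_b, n):
    """Jaccard-style distance between the multisets of n-word phrases.

    Assumption: this is a stand-in measure, not the distance from the paper.
    Returns 0.0 for identical phrase multisets and 1.0 for disjoint ones.
    """
    a = Counter(ngrams(text_a.lower().split(), n))
    b = Counter(ngrams(text_b.lower().split(), n))
    total = sum((a | b).values())   # multiset union (max of counts)
    if total == 0:                  # neither text has any n-word phrase
        return 0.0
    shared = sum((a & b).values())  # multiset intersection (min of counts)
    return 1.0 - shared / total


def phrase_distance(text_a, text_b, max_n=3):
    """Average the n-gram distances for n = 1..max_n (assumed weighting)."""
    return sum(ngram_distance(text_a, text_b, n)
               for n in range(1, max_n + 1)) / max_n


if __name__ == "__main__":
    # Same words, different order: the word-level distance sees no
    # difference, while the two-word-phrase distance does.
    print(ngram_distance("a b c", "c b a", 1))
    print(ngram_distance("a b c", "c b a", 2))
```

In this sketch, two texts that use the same words in a different order are indistinguishable at the single-word level but differ at the phrase level, which is the kind of structural difference a purely word-based distance cannot register.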
