A City of Millions: Mapping Literary Social Networks At Scale

Abstract

We release 70,509 high-quality social networks extracted from multilingual fiction and nonfiction narratives. We additionally provide metadata for approximately 30,000 of these texts (73% nonfiction and 27% fiction) written between 1800 and 1999 in 58 languages. This dataset provides information on historical social worlds at an unprecedented scale, including data for 2,510,021 individuals in 2,805,482 pair-wise relationships annotated for affinity and relationship type. We achieve this scale by automating previously manual methods of extracting social networks; specifically, we adapt an existing annotation task as a language model prompt, ensuring consistency at scale with the use of structured output. This dataset serves as a unique resource for humanities and social science research by providing data on cognitive models of social realities.
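
As a rough illustration of what "structured output" can mean in practice, the sketch below validates one model response against a fixed schema before turning it into network edges. It is not the authors' pipeline: the JSON layout, the field names (`source`, `target`, `affinity`, `type`), and both label sets are hypothetical placeholders, since the abstract does not specify them.

```python
import json
from dataclasses import dataclass

# Hypothetical label sets; the abstract does not list the real categories.
AFFINITIES = {"positive", "negative", "neutral"}
REL_TYPES = {"family", "romantic", "professional", "social"}


@dataclass(frozen=True)
class Relationship:
    """One annotated pair-wise relationship between two characters."""
    source: str
    target: str
    affinity: str
    rel_type: str


def parse_network(raw: str) -> list[Relationship]:
    """Parse one model response and reject labels outside the schema.

    Assumes the response is a JSON object with a "relationships" list
    (an illustrative format, not one taken from the paper).
    """
    data = json.loads(raw)
    edges = []
    for item in data["relationships"]:
        if item["affinity"] not in AFFINITIES:
            raise ValueError(f"unknown affinity: {item['affinity']!r}")
        if item["type"] not in REL_TYPES:
            raise ValueError(f"unknown relationship type: {item['type']!r}")
        edges.append(
            Relationship(item["source"], item["target"], item["affinity"], item["type"])
        )
    return edges
```

Rejecting any response that falls outside the schema is one way to keep annotations consistent across tens of thousands of texts, because every accepted edge carries labels drawn from the same closed vocabulary.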

@article{hamilton2025_2502.19590,
  title={A City of Millions: Mapping Literary Social Networks At Scale},
  author={Sil Hamilton and Rebecca M. M. Hicke and David Mimno and Matthew Wilkens},
  journal={arXiv preprint arXiv:2502.19590},
  year={2025}
}