arXiv:2103.13103

Finnish Paraphrase Corpus

24 March 2021
Jenna Kanerva
Filip Ginter
Li-Hsin Chang
Iiro Rastas
Valtteri Skantsi
Jemina Kilpeläinen
Hanna-Mari Kupari
Jenna Saarni
Maija Sevón
Otto Tarkka
Abstract

In this paper, we introduce the first fully manually annotated paraphrase corpus for Finnish, containing 53,572 paraphrase pairs harvested from alternative subtitles and news headings. Of all paraphrase pairs in our corpus, 98% are manually classified as paraphrases at least in their given context, if not in all contexts. Additionally, we establish a manual candidate selection method and demonstrate its feasibility for high-quality paraphrase selection in terms of both cost and quality.
