ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2503.15195
38
0

Benchmarking Large Language Models for Handwritten Text Recognition

19 March 2025
Giorgia Crosilla
Lukas Klic
Giovanni Colavizza
ArXivPDFHTML
Abstract

Traditional machine learning models for Handwritten Text Recognition (HTR) rely on supervised training, requiring extensive manual annotations, and often produce errors due to the separation between layout and text processing. In contrast, Multimodal Large Language Models (MLLMs) offer a general approach to recognizing diverse handwriting styles without the need for model-specific training. The study benchmarks various proprietary and open-source LLMs against Transkribus models, evaluating their performance on both modern and historical datasets written in English, French, German, and Italian. In addition, emphasis is placed on testing the models' ability to autonomously correct previously generated outputs. Findings indicate that proprietary models, especially Claude 3.5 Sonnet, outperform open-source alternatives in zero-shot settings. MLLMs achieve excellent results in recognizing modern handwriting and exhibit a preference for the English language due to their pre-training dataset composition. Comparisons with Transkribus show no consistent advantage for either approach. Moreover, LLMs demonstrate limited ability to autonomously correct errors in zero-shot transcriptions.

View on arXiv
@article{crosilla2025_2503.15195,
  title={ Benchmarking Large Language Models for Handwritten Text Recognition },
  author={ Giorgia Crosilla and Lukas Klic and Giovanni Colavizza },
  journal={arXiv preprint arXiv:2503.15195},
  year={ 2025 }
}
Comments on this paper