Post-OCR Text Correction for Bulgarian Historical Documents

Post-OCR Text Correction for Bulgarian Historical Documents

31 August 2024

Dimitar Dimitrov

Momchil Hardalov

Preslav Nakov

Papers citing "Post-OCR Text Correction for Bulgarian Historical Documents"

11 / 11 papers shown

Title
Entropy Heat-Mapping: Localizing GPT-Based OCR Errors with Sliding-Window Shannon Analysis Alexei Kaltchenko 72 0 0 30 Apr 2025
OCR Post Correction for Endangered Language Texts Shruti Rijhwani Antonios Anastasopoulos Graham Neubig 40 46 0 10 Nov 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention Pengcheng He Xiaodong Liu Jianfeng Gao Weizhu Chen AAML 135 2,730 0 05 Jun 2020
Unsupervised Cross-lingual Representation Learning at Scale Alexis Conneau Kartikay Khandelwal Naman Goyal Vishrav Chaudhary Guillaume Wenzek Francisco Guzmán Edouard Grave Myle Ott Luke Zettlemoyer Veselin Stoyanov 199 6,546 0 05 Nov 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter Victor Sanh Lysandre Debut Julien Chaumond Thomas Wolf 218 7,498 0 02 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy M. Lewis Luke Zettlemoyer Veselin Stoyanov AIMat 582 24,422 0 26 Jul 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.7K 94,729 0 11 Oct 2018
Get To The Point: Summarization with Pointer-Generator Networks A. See Peter J. Liu Christopher D. Manning 3DPC 278 4,019 0 14 Apr 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation Yonghui Wu M. Schuster Zhiwen Chen Quoc V. Le Mohammad Norouzi ... Alex Rudnick Oriol Vinyals G. Corrado Macduff Hughes J. Dean AIMat 891 6,787 0 26 Sep 2016
Bag of Tricks for Efficient Text Classification Armand Joulin Edouard Grave Piotr Bojanowski Tomas Mikolov VLM 168 4,618 0 06 Jul 2016
Neural Machine Translation of Rare Words with Subword Units Rico Sennrich Barry Haddow Alexandra Birch 207 7,734 0 31 Aug 2015