Lights, Camera, Action! A Framework to Improve NLP Accuracy over OCR documents

6 August 2021
Amit Gupte
Alexey Romanov
Sahitya Mantravadi
Dalitso Banda
Jianjie Liu
Raza Khan
Lakshmanan Ramu Meenal
Benjamin Han
Soundar Srinivasan

Papers citing "Lights, Camera, Action! A Framework to Improve NLP Accuracy over OCR documents"

7 / 7 papers shown
Reading the unreadable: Creating a dataset of 19th century English newspapers using image-to-text language models
Jonathan Bourne
77
0
0
24 Feb 2025
Scrambled text: training Language Models to correct OCR errors using synthetic data
Jonathan Bourne
SyDa
38
2
0
29 Sep 2024
M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation
Benjamin Hsu
Xiaoyu Liu
Huayang Li
Yoshinari Fujinuma
Maria Nadejde
Xing Niu
Yair Kittenplon
Ron Litman
R. Pappagari
44
4
0
12 Jun 2024
Data Generation for Post-OCR correction of Cyrillic handwriting
Evgenii Davydkin
Aleksandr Markelov
Egor Iuldashev
Anton Dudkin
I. Krivorotov
42
3
0
27 Nov 2023
OCR Improves Machine Translation for Low-Resource Languages
Oana Ignat
Jean Maillard
Vishrav Chaudhary
Francisco Guzmán
37
10
0
27 Feb 2022
Sim2Real Docs: Domain Randomization for Documents in Natural Scenes using Ray-traced Rendering
Nikhil Maddikunta
Huijun Zhao
Sumit Keswani
Alfy Samuel
Fu-Ming Guo
Nishan Srishankar
Vishwa Pardeshi
Austin Huang
VGen
26
1
0
16 Dec 2021
Synthetic Document Generator for Annotation-free Layout Recognition
Natraj Raman
Sameena Shah
Manuela Veloso
34
10
0
11 Nov 2021