VRDU: A Benchmark for Visually-rich Document Understanding

15 November 2022

Chen-Yu Lee

Papers citing "VRDU: A Benchmark for Visually-rich Document Understanding"

Title
Doc-CoB: Enhancing Multi-Modal Document Understanding with Visual Chain-of-Boxes Reasoning Ye Mo Zirui Shao Kai Ye Xianwei Mao Bo Zhang ... Gang Huang Kehan Chen Zhou Huan Zixu Yan Sheng Zhou LRM 53 0 0 24 May 2025
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Yupan Huang Tengchao Lv Lei Cui Yutong Lu Furu Wei 95 458 0 18 Apr 2022
FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction Chen-Yu Lee Chun-Liang Li Timothy Dozat Vincent Perot Guolong Su Nan Hua Joshua Ainslie Renshen Wang Yasuhisa Fujii Tomas Pfister 73 79 0 16 Mar 2022
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding Jiapeng Wang Lianwen Jin Kai Ding VLM 69 142 0 28 Feb 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA Ali Furkan Biten Ron Litman Yusheng Xie Srikar Appalaraju R. Manmatha ViT 95 102 0 23 Dec 2021
DocFormer: End-to-End Transformer for Document Understanding Srikar Appalaraju Bhavan A. Jasani Bhargava Urala Kota Yusheng Xie R. Manmatha ViT 88 279 0 22 Jun 2021
Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts Tomasz Stanislawek Filip Graliński Anna Wróblewska Dawid Lipiński Agnieszka Kaliska Paulina Rosalska Bartosz Topolski P. Biecek 75 95 0 12 May 2021
ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction Zheng Huang Kai Chen Jianhua He X. Bai Dimosthenis Karatzas Shijian Lu C. V. Jawahar 57 319 0 18 Mar 2021
Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer Rafal Powalski Łukasz Borchmann Dawid Jurkiewicz Tomasz Dwojak Michal Pietruszka Gabriela Pałka ViT 66 159 0 18 Feb 2021
DocVQA: A Dataset for VQA on Document Images Minesh Mathew Dimosthenis Karatzas C. V. Jawahar 144 743 0 01 Jul 2020
LayoutLM: Pre-training of Text and Layout for Document Image Understanding Yiheng Xu Minghao Li Lei Cui Shaohan Huang Furu Wei Ming Zhou 135 707 0 31 Dec 2019
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents Guillaume Jaume H. K. Ekenel Jean-Philippe Thiran 168 370 0 27 May 2019