Unified Pretraining Framework for Document Understanding

22 April 2022

Jiuxiang Gu

Papers citing "Unified Pretraining Framework for Document Understanding"

Document Understanding Dataset and Evaluation (DUDE) Jordy Van Landeghem Rubèn Pérez Tito Łukasz Borchmann Michal Pietruszka Pawel Józiak ... Bertrand Ackaert Ernest Valveny Matthew Blaschko Sien Moens Tomasz Stanislawek VGen 24 53 0 15 May 2023
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation Ayan Banerjee Sanket Biswas Josep Lladós Umapada Pal ViT 20 16 0 08 May 2023
Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation Renshen Wang Yasuhisa Fujii Alessandro Bissacco GNN 28 6 0 04 May 2023
FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction Nils Loose Chun-Liang Li Hao Zhang Timothy Dozat Felix Mächtle ... Shangbang Long Siyang Qin Yasuhisa Fujii Nan Hua T. Eisenbarth SSL 48 19 0 04 May 2023
SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation Subhajit Maity Sanket Biswas Siladittya Manna Ayan Banerjee Josep Lladós Saumik Bhattacharya Umapada Pal 36 5 0 01 May 2023
PARAGRAPH2GRAPH: A GNN-based framework for layout paragraph analysis Shuyong Wei Nuo Xu 27 5 0 24 Apr 2023
GeoLayoutLM: Geometric Pre-training for Visual Information Extraction Chuwei Luo Changxu Cheng Qi Zheng Cong Yao 35 44 0 21 Apr 2023
Modeling Entities as Semantic Points for Visual Information Extraction in the Wild Zhibo Yang Rujiao Long Pengfei Wang Sibo Song Humen Zhong Wenqing Cheng X. Bai Cong Yao 41 22 0 23 Mar 2023
StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training Yu Yu Yulin Li Chengquan Zhang Xiaoqiang Zhang Zengyuan Guo Xiameng Qin Kun Yao Junyu Han Errui Ding Jingdong Wang 24 45 0 01 Mar 2023
DocILE Benchmark for Document Information Localization and Extraction Štěpán Šimsa Milan Šulc Michal Uřičář Yash J. Patel Ahmed Hamdi ... Matyáš Skalický Jiří Matas Antoine Doucet Mickael Coustaty Dimosthenis Karatzas 24 34 0 11 Feb 2023
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding Haoli Bai Zhiguang Liu Xiaojun Meng Wentao Li Shuangning Liu ... Liangwei Wang Lu Hou Jiansheng Wei Xin Jiang Qun Liu ViT 35 13 0 19 Dec 2022
Unifying Vision, Text, and Layout for Universal Document Processing Zineng Tang Ziyi Yang Guoxin Wang Yuwei Fang Yang Liu Chenguang Zhu Michael Zeng Chao-Yue Zhang Joey Tianyi Zhou VLM 32 107 0 05 Dec 2022
MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding Zilong Wang Jiuxiang Gu Chris Tensmeyer Nikolaos Barmpalios A. Nenkova Tong Sun Jingbo Shang Vlad I. Morariu VLM 25 12 0 27 Nov 2022
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models Lei Wang Jian He Xingdong Xu Ning Liu Hui-juan Liu 41 2 0 27 Nov 2022
Evaluating Out-of-Distribution Performance on Document Image Classifiers Stefan Larson Gordon Lim Yutong Ai David Kuang Kevin Leach OODD OOD 37 18 0 14 Oct 2022
XDoc: Unified Pre-training for Cross-Format Document Understanding Jingye Chen Tengchao Lv Lei Cui Changrong Zhang Furu Wei 52 13 0 06 Oct 2022
Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis Siwen Luo Yi Ding Siqu Long Josiah Poon S. Han GNN 45 16 0 22 Aug 2022
Knowing Where and What: Unified Word Block Pretraining for Document Understanding Song Tao Zijian Wang Tiantian Fan Canjie Luo Can Huang SSL 40 2 0 28 Jul 2022
Test-Time Adaptation for Visual Document Understanding Sayna Ebrahimi Sercan Ö. Arik Tomas Pfister OOD 35 6 0 15 Jun 2022
Relational Representation Learning in Visually-Rich Documents Xin Li Yan Zheng Yiqing Hu H. Cao Yunfei Wu Deqiang Jiang Yinsong Liu Bo Ren 20 12 0 05 May 2022
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Yupan Huang Tengchao Lv Lei Cui Yutong Lu Furu Wei 35 436 0 18 Apr 2022
Multimodal Pre-training Based on Graph Attention Network for Document Understanding Zhenrong Zhang Jiefeng Ma Jun Du Licheng Wang Jianshu Zhang 23 37 0 25 Mar 2022
OCR-IDL: OCR Annotations for Industry Document Library Dataset Ali Furkan Biten Rubèn Pérez Tito Lluís Gómez Ernest Valveny Dimosthenis Karatzas 27 26 0 25 Feb 2022
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding Yang Xu Yiheng Xu Tengchao Lv Lei Cui Furu Wei ... D. Florêncio Cha Zhang Wanxiang Che Min Zhang Lidong Zhou ViT MLLM 153 501 0 29 Dec 2020
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents Guillaume Jaume H. K. Ekenel Jean-Philippe Thiran 143 357 0 27 May 2019