ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2204.10939
  4. Cited By
Unified Pretraining Framework for Document Understanding

Unified Pretraining Framework for Document Understanding

22 April 2022
Jiuxiang Gu
Jason Kuen
Vlad I. Morariu
Handong Zhao
Nikolaos Barmpalios
R. Jain
A. Nenkova
Tong Sun
ArXivPDFHTML

Papers citing "Unified Pretraining Framework for Document Understanding"

25 / 75 papers shown
Title
Document Understanding Dataset and Evaluation (DUDE)
Document Understanding Dataset and Evaluation (DUDE)
Jordy Van Landeghem
Rubèn Pérez Tito
Łukasz Borchmann
Michal Pietruszka
Pawel Józiak
...
Bertrand Ackaert
Ernest Valveny
Matthew Blaschko
Sien Moens
Tomasz Stanislawek
VGen
24
53
0
15 May 2023
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for
  Document Instance Segmentation
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation
Ayan Banerjee
Sanket Biswas
Josep Lladós
Umapada Pal
ViT
20
16
0
08 May 2023
Text Reading Order in Uncontrolled Conditions by Sparse Graph
  Segmentation
Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation
Renshen Wang
Yasuhisa Fujii
Alessandro Bissacco
GNN
28
6
0
04 May 2023
FormNetV2: Multimodal Graph Contrastive Learning for Form Document
  Information Extraction
FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction
Nils Loose
Chun-Liang Li
Hao Zhang
Timothy Dozat
Felix Mächtle
...
Shangbang Long
Siyang Qin
Yasuhisa Fujii
Nan Hua
T. Eisenbarth
SSL
48
19
0
04 May 2023
SelfDocSeg: A Self-Supervised vision-based Approach towards Document
  Segmentation
SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation
Subhajit Maity
Sanket Biswas
Siladittya Manna
Ayan Banerjee
Josep Lladós
Saumik Bhattacharya
Umapada Pal
36
5
0
01 May 2023
PARAGRAPH2GRAPH: A GNN-based framework for layout paragraph analysis
PARAGRAPH2GRAPH: A GNN-based framework for layout paragraph analysis
Shuyong Wei
Nuo Xu
27
5
0
24 Apr 2023
GeoLayoutLM: Geometric Pre-training for Visual Information Extraction
GeoLayoutLM: Geometric Pre-training for Visual Information Extraction
Chuwei Luo
Changxu Cheng
Qi Zheng
Cong Yao
35
44
0
21 Apr 2023
Modeling Entities as Semantic Points for Visual Information Extraction
  in the Wild
Modeling Entities as Semantic Points for Visual Information Extraction in the Wild
Zhibo Yang
Rujiao Long
Pengfei Wang
Sibo Song
Humen Zhong
Wenqing Cheng
X. Bai
Cong Yao
41
22
0
23 Mar 2023
StrucTexTv2: Masked Visual-Textual Prediction for Document Image
  Pre-training
StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training
Yu Yu
Yulin Li
Chengquan Zhang
Xiaoqiang Zhang
Zengyuan Guo
Xiameng Qin
Kun Yao
Junyu Han
Errui Ding
Jingdong Wang
24
45
0
01 Mar 2023
DocILE Benchmark for Document Information Localization and Extraction
DocILE Benchmark for Document Information Localization and Extraction
vStvepán vSimsa
Milan vSulc
Michal Uvrivcávr
Yash J. Patel
Ahmed Hamdi
...
Matyávs Skalický
Jivrí Matas
Antoine Doucet
Mickael Coustaty
Dimosthenis Karatzas
24
34
0
11 Feb 2023
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document
  Understanding
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding
Haoli Bai
Zhiguang Liu
Xiaojun Meng
Wentao Li
Shuangning Liu
...
Liangwei Wang
Lu Hou
Jiansheng Wei
Xin Jiang
Qun Liu
ViT
35
13
0
19 Dec 2022
Unifying Vision, Text, and Layout for Universal Document Processing
Unifying Vision, Text, and Layout for Universal Document Processing
Zineng Tang
Ziyi Yang
Guoxin Wang
Yuwei Fang
Yang Liu
Chenguang Zhu
Michael Zeng
Chao-Yue Zhang
Joey Tianyi Zhou
VLM
32
107
0
05 Dec 2022
MGDoc: Pre-training with Multi-granular Hierarchy for Document Image
  Understanding
MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding
Zilong Wang
Jiuxiang Gu
Chris Tensmeyer
Nikolaos Barmpalios
A. Nenkova
Tong Sun
Jingbo Shang
Vlad I. Morariu
VLM
25
12
0
27 Nov 2022
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image
  Models
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models
Lei Wang
Jian He
Xingdong Xu
Ning Liu
Hui-juan Liu
41
2
0
27 Nov 2022
Evaluating Out-of-Distribution Performance on Document Image Classifiers
Evaluating Out-of-Distribution Performance on Document Image Classifiers
Stefan Larson
Gordon Lim
Yutong Ai
David Kuang
Kevin Leach
OODD
OOD
37
18
0
14 Oct 2022
XDoc: Unified Pre-training for Cross-Format Document Understanding
XDoc: Unified Pre-training for Cross-Format Document Understanding
Jingye Chen
Tengchao Lv
Lei Cui
Changrong Zhang
Furu Wei
52
13
0
06 Oct 2022
Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout
  Analysis
Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis
Siwen Luo
Yi Ding
Siqu Long
Josiah Poon
S. Han
GNN
45
16
0
22 Aug 2022
Knowing Where and What: Unified Word Block Pretraining for Document Understanding
Song Tao
Zijian Wang
Tiantian Fan
Canjie Luo
Can Huang
SSL
40
2
0
28 Jul 2022
Test-Time Adaptation for Visual Document Understanding
Test-Time Adaptation for Visual Document Understanding
Sayna Ebrahimi
Sercan Ö. Arik
Tomas Pfister
OOD
35
6
0
15 Jun 2022
Relational Representation Learning in Visually-Rich Documents
Relational Representation Learning in Visually-Rich Documents
Xin Li
Yan Zheng
Yiqing Hu
H. Cao
Yunfei Wu
Deqiang Jiang
Yinsong Liu
Bo Ren
20
12
0
05 May 2022
LayoutLMv3: Pre-training for Document AI with Unified Text and Image
  Masking
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Yupan Huang
Tengchao Lv
Lei Cui
Yutong Lu
Furu Wei
35
436
0
18 Apr 2022
Multimodal Pre-training Based on Graph Attention Network for Document
  Understanding
Multimodal Pre-training Based on Graph Attention Network for Document Understanding
Zhenrong Zhang
Jiefeng Ma
Jun Du
Licheng Wang
Jianshu Zhang
23
37
0
25 Mar 2022
OCR-IDL: OCR Annotations for Industry Document Library Dataset
OCR-IDL: OCR Annotations for Industry Document Library Dataset
Ali Furkan Biten
Rubèn Pérez Tito
Lluís Gómez
Ernest Valveny
Dimosthenis Karatzas
27
26
0
25 Feb 2022
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document
  Understanding
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
...
D. Florêncio
Cha Zhang
Wanxiang Che
Min Zhang
Lidong Zhou
ViT
MLLM
153
501
0
29 Dec 2020
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
143
357
0
27 May 2019
Previous
12