2204.08387
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
18 April 2022
Yupan Huang
Tengchao Lv
Lei Cui
Yutong Lu
Furu Wei
Papers citing "LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking"
50 / 277 papers shown
Title
Beyond Document Page Classification: Design, Datasets, and Challenges
Jordy Van Landeghem
Sanket Biswas
Matthew B. Blaschko
Marie-Francine Moens
110
7
0
24 Aug 2023
Enhancing Visually-Rich Document Understanding via Layout Structure Modeling
Qiwei Li
Z. Li
Xiantao Cai
Bo Du
Hai Zhao
64
8
0
15 Aug 2023
A Graphical Approach to Document Layout Analysis
Jilin Wang
Michael Krumdick
Baojia Tong
Hamima Halim
M. Sokolov
Vadym Barda
Delphine Vendryes
Christy Tanner
61
9
0
03 Aug 2023
Multimodal Document Analytics for Banking Process Automation
C. Gerling
Stefan Lessmann
79
5
0
21 Jul 2023
PPN: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts
Kaiwen Wei
Jie Yao
Jingyuan Zhang
Yangyang Kang
Fubang Zhao
Yating Zhang
Changlong Sun
Xin Jin
Xin Zhang
51
4
0
20 Jul 2023
DocTr: Document Transformer for Structured Information Extraction in Documents
Haofu Liao
Aruni RoyChowdhury
Weijian Li
Ankan Bansal
Yuting Zhang
Zhuowen Tu
R. Satzoda
R. Manmatha
Vijay Mahadevan
70
12
0
16 Jul 2023
Towards Open Federated Learning Platforms: Survey and Vision from Technical and Legal Perspectives
Moming Duan
Qinbin Li
Linshan Jiang
Bingsheng He
FedML
105
5
0
05 Jul 2023
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Jiabo Ye
Anwen Hu
Haiyang Xu
Qinghao Ye
Mingshi Yan
...
Chenliang Li
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
VLM
MLLM
89
128
0
04 Jul 2023
Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images
Tahira Shehzadi
K. Hashmi
D. Stricker
Marcus Liwicki
Muhammad Zeshan Afzal
122
7
0
23 Jun 2023
On Evaluation of Document Classification using RVL-CDIP
Stefan Larson
Gordon Lim
Kevin Leach
111
3
0
21 Jun 2023
DocumentNet: Bridging the Data Gap in Document Pre-Training
Lijun Yu
Jin Miao
Xiaoyu Sun
Jiayi Chen
Alexander G. Hauptmann
H. Dai
Wei Wei
31
3
0
15 Jun 2023
DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents
Fuxiao Liu
Hao Tan
Chris Tensmeyer
CLIP
VLM
103
18
0
09 Jun 2023
Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models
Jiabang He
Yilang Hu
Lei Wang
Xingdong Xu
Ning Liu
Hui-juan Liu
Hengtao Shen
VLM
OOD
68
3
0
05 Jun 2023
DocFormerv2: Local Features for Document Understanding
Srikar Appalaraju
Peng Tang
Qi Dong
Nishant Sankaran
Yichu Zhou
R. Manmatha
109
41
0
02 Jun 2023
Are Layout-Infused Language Models Robust to Layout Distribution Shifts? A Case Study with Scientific Documents
Catherine Chen
Zejiang Shen
Dan Klein
Gabriel Stanovsky
Doug Downey
Kyle Lo
64
3
0
01 Jun 2023
End-to-End Document Classification and Key Information Extraction using Assignment Optimization
Ciaran Cooney
Joana Cavadas
Liam Madigan
Bradley Savage
Rachel Heyburn
Mairead O'Cuinn
56
0
0
01 Jun 2023
Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering
Wenjin Wang
Yunhao Li
Yixin Ou
Yin Zhang
VLM
131
26
0
01 Jun 2023
LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding
Yi Tu
Ya Guo
Huan Chen
Jinyang Tang
64
15
0
30 May 2023
Alfred: A System for Prompted Weak Supervision
Peilin Yu
Stephen H. Bach
82
10
0
29 May 2023
GVdoc: Graph-based Visual Document Classification
Fnu Mohbat
Mohammed J Zaki
Catherine Finegan-Dollak
Ashish Verma
OOD
72
1
0
26 May 2023
Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models
Geewook Kim
Hodong Lee
D. Kim
Haeji Jung
S. Park
Yoon Kim
Sangdoo Yun
Taeho Kil
Bado Lee
Seunghyun Park
VLM
105
4
0
24 May 2023
ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents
Christoph Auer
A. Nassar
Maksym Lysak
Michele Dolfi
Nikolaos Livathinos
Peter W. J. Staar
OOD
3DV
65
7
0
24 May 2023
Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation
Prashant Krishnan
Zilong Wang
Yangkun Wang
Jingbo Shang
69
3
0
24 May 2023
UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning
Ahmed Masry
P. Kavehzadeh
Do Xuan Long
Enamul Hoque
Shafiq Joty
LRM
95
113
0
24 May 2023
RE^2: Region-Aware Relation Extraction from Visually Rich Documents
Pritika Ramu
Sijia Wang
Lalla Mouatadid
Joy Rimchala
Lifu Huang
52
0
0
24 May 2023
DUBLIN -- Document Understanding By Language-Image Network
Kriti Aggarwal
Aditi Khandelwal
Kumar Tanmay
Owais Mohammed Khan
Qiang Liu
Monojit Choudhury
Hardik Hansrajbhai Chauhan
Subhojit Som
Vishrav Chaudhary
Saurabh Tiwary
ObjD
VLM
117
0
0
23 May 2023
Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document
Xiangnan Chen
Qianwen Xiao
Juncheng Li
Duo Dong
Jun Lin
Xiaozhong Liu
Siliang Tang
86
5
0
23 May 2023
Detecting automatically the layout of clinical documents to enhance the performances of downstream natural language processing
C. Gérardin
Perceval Wajsburt
Basile Dura
Alice Calliger
Alexandre Mouchet
X. Tannier
R. Bey
62
1
0
23 May 2023
Towards Zero-shot Relation Extraction in Web Mining: A Multimodal Approach with Relative XML Path
Zilong Wang
Jingbo Shang
82
0
0
23 May 2023
Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding
Mingliang Zhai
Yulin Li
Xiameng Qin
Chen Yi
Qunyi Xie
Chengquan Zhang
Kun Yao
Yuwei Wu
Yunde Jia
44
8
0
19 May 2023
ICDAR 2023 Competition on Hierarchical Text Detection and Recognition
Shangbang Long
Siyang Qin
Dmitry Panteleev
Alessandro Bissacco
Yasuhisa Fujii
Michalis Raptis
VLM
115
20
0
16 May 2023
Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding
ShuWei Feng
Tianyang Zhan
Zhanming Jie
Trung Quoc Luong
Xiaoran Jin
51
1
0
16 May 2023
Document Understanding Dataset and Evaluation (DUDE)
Jordy Van Landeghem
Rubèn Pérez Tito
Łukasz Borchmann
Michal Pietruszka
Pawel Józiak
...
Bertrand Ackaert
Ernest Valveny
Matthew Blaschko
Sien Moens
Tomasz Stanislawek
VGen
99
66
0
15 May 2023
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation
Ayan Banerjee
Sanket Biswas
Josep Lladós
Umapada Pal
ViT
90
16
0
08 May 2023
Language Independent Neuro-Symbolic Semantic Parsing for Form Understanding
Bhanu Prakash Voutharoja
Zhuang Li
Fatemeh Shiri
65
1
0
08 May 2023
Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation
Renshen Wang
Yasuhisa Fujii
Alessandro Bissacco
GNN
57
6
0
04 May 2023
FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction
Nils Loose
Chun-Liang Li
Hao Zhang
Timothy Dozat
Felix Mächtle
...
Shangbang Long
Siyang Qin
Yasuhisa Fujii
Nan Hua
T. Eisenbarth
SSL
94
20
0
04 May 2023
Doc2SoarGraph: Discrete Reasoning over Visually-Rich Table-Text Documents via Semantic-Oriented Hierarchical Graphs
Fengbin Zhu
Chao Wang
Fuli Feng
Zifeng Ren
Moxin Li
Tat-Seng Chua
78
4
0
03 May 2023
SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation
Subhajit Maity
Sanket Biswas
Siladittya Manna
Ayan Banerjee
Josep Lladós
Saumik Bhattacharya
Umapada Pal
85
5
0
01 May 2023
Information Redundancy and Biases in Public Document Information Extraction Benchmarks
S. Laatiri
Pirashanth Ratnamogan
Joel Tang
Laurent Lam
William Vanhuffel
Fabien Caspani
43
1
0
28 Apr 2023
DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents
M. Dhouib
G. Bettaieb
A. Shabou
73
22
0
24 Apr 2023
PARAGRAPH2GRAPH: A GNN-based framework for layout paragraph analysis
Shuyong Wei
Nuo Xu
51
5
0
24 Apr 2023
Information Extraction from Documents: Question Answering vs Token Classification in real-world setups
Laurent Lam
Pirashanth Ratnamogan
Joel Tang
William Vanhuffel
Fabien Caspani
64
0
0
21 Apr 2023
GeoLayoutLM: Geometric Pre-training for Visual Information Extraction
Chuwei Luo
Changxu Cheng
Qi Zheng
Cong Yao
78
49
0
21 Apr 2023
Context-Aware Classification of Legal Document Pages
Pavlos Fragkogiannis
Martina Forster
Grace E. Lee
Dell Zhang
71
5
0
05 Apr 2023
The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces
Kyle Lo
Joseph Chee Chang
Andrew Head
Jonathan Bragg
Amy X. Zhang
...
Caroline M Wu
Jiangjiang Yang
Angele Zamarron
Marti A. Hearst
Daniel S. Weld
74
20
0
25 Mar 2023
Modeling Entities as Semantic Points for Visual Information Extraction in the Wild
Zhibo Yang
Rujiao Long
Pengfei Wang
Sibo Song
Humen Zhong
Wenqing Cheng
X. Bai
Cong Yao
72
22
0
23 Mar 2023
ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction
Jiabang He
Lei Wang
Yingpeng Hu
Ning Liu
Hui-juan Liu
Xingdong Xu
Hengtao Shen
MLLM
78
46
0
09 Mar 2023
StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training
Yu Yu
Yulin Li
Chengquan Zhang
Xiaoqiang Zhang
Zengyuan Guo
Xiameng Qin
Kun Yao
Junyu Han
Errui Ding
Jingdong Wang
78
45
0
01 Mar 2023
Entry Separation using a Mixed Visual and Textual Language Model: Application to 19th century French Trade Directories
Bertrand Duménieu
Edwin Carlinet
N. Abadie
Joseph Chazalon
56
0
0
17 Feb 2023