1912.13318
Cited By
LayoutLM: Pre-training of Text and Layout for Document Image Understanding
31 December 2019
Yiheng Xu
Minghao Li
Lei Cui
Shaohan Huang
Furu Wei
Ming Zhou
Papers citing
"LayoutLM: Pre-training of Text and Layout for Document Image Understanding"
50 / 371 papers shown
Title
DM²S²: Deep Multi-Modal Sequence Sets with Hierarchical Modality Attention
Shunsuke Kitada
Yuki Iwazaki
Riku Togashi
Hitoshi Iyatomi
21
1
0
07 Sep 2022
Graph Neural Networks and Representation Embedding for Table Extraction in PDF Documents
Andrea Gemelli
Emanuele Vivoli
S. Marinai
LMTD
31
11
0
23 Aug 2022
Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks
Andrea Gemelli
Sanket Biswas
Enrico Civitelli
Josep Lladós
S. Marinai
23
16
0
23 Aug 2022
TaCo: Textual Attribute Recognition via Contrastive Learning
Chang Nie
Yiqing Hu
Yanqiu Qu
Hao Liu
Deqiang Jiang
Bo Ren
32
0
0
22 Aug 2022
Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis
Siwen Luo
Yi Ding
Siqu Long
Josiah Poon
S. Han
GNN
45
16
0
22 Aug 2022
Understanding Long Documents with Different Position-Aware Attentions
Hai Pham
Guoxin Wang
Yijuan Lu
D. Florêncio
Changrong Zhang
27
9
0
17 Aug 2022
Information Extraction from Scanned Invoice Images using Text Analysis and Layout Features
H. Ha
Ales Horak
25
14
0
08 Aug 2022
Knowing Where and What: Unified Word Block Pretraining for Document Understanding
Song Tao
Zijian Wang
Tiantian Fan
Canjie Luo
Can Huang
SSL
40
2
0
28 Jul 2022
Contextual Text Block Detection towards Scene Text Understanding
Chuhui Xue
Jiaxing Huang
Shijian Lu
Changhu Wang
Song Bai
27
7
0
26 Jul 2022
Towards Complex Document Understanding By Discrete Reasoning
Fengbin Zhu
Wenqiang Lei
Fuli Feng
Chao Wang
Haozhou Zhang
Tat-Seng Chua
31
43
0
25 Jul 2022
Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration
Zhenyu Zhang
Yu Bowen
Haiyang Yu
Tingwen Liu
Cheng Fu
Jingyang Li
Chengguang Tang
Jian Sun
Yongbin Li
39
6
0
14 Jul 2022
GMN: Generative Multi-modal Network for Practical Document Information Extraction
H. Cao
Jiefeng Ma
Antai Guo
Yiqing Hu
Hao Liu
Deqiang Jiang
Yinsong Liu
Bo Ren
26
8
0
11 Jul 2022
Sequence-aware multimodal page classification of Brazilian legal documents
Pedro Henrique Luz de Araujo
Ana Paula G. S. de Almeida
Fabricio Ataides Braz
Nilton Correia da Silva
Flávio de Barros Vidal
Teofilo de Campos
27
7
0
02 Jul 2022
Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding
Chuwei Luo
Guozhi Tang
Qi Zheng
Cong Yao
Lianwen Jin
Chenliang Li
Yang Xue
Luo Si
32
16
0
27 Jun 2022
Business Document Information Extraction: Towards Practical Benchmarks
Matyáš Skalický
Štěpán Šimsa
Michal Uřičář
Milan Šulc
33
9
0
20 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
60
352
0
17 Jun 2022
Test-Time Adaptation for Visual Document Understanding
Sayna Ebrahimi
Sercan Ö. Arik
Tomas Pfister
OOD
35
6
0
15 Jun 2022
RDU: A Region-based Approach to Form-style Document Understanding
Fengbin Zhu
Chao Wang
Wenqiang Lei
Ziyang Liu
Tat-Seng Chua
30
2
0
14 Jun 2022
DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis
B. Pfitzmann
Christoph Auer
Michele Dolfi
A. Nassar
Peter W. J. Staar
30
85
0
02 Jun 2022
Delivering Document Conversion as a Cloud Service with High Throughput and Responsiveness
Christoph Auer
Michele Dolfi
A. Carvalho
Cesar Berrospi Ramis
Peter W. J. Staar
27
9
0
01 Jun 2022
HYCEDIS: HYbrid Confidence Engine for Deep Document Intelligence System
Bao-Sinh Nguyen
Q. Tran
Tuan-Anh Dang Nguyen
D. Nguyen
H. Le
37
0
0
01 Jun 2022
MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining
Pengyuan Lyu
Chengquan Zhang
Shanshan Liu
Meina Qiao
Yangliu Xu
Liang Wu
Kun Yao
Junyu Han
Errui Ding
Jingdong Wang
42
42
0
01 Jun 2022
Jointly Learning Span Extraction and Sequence Labeling for Information Extraction from Business Documents
Nguyen Hong Son
Hieu M. Vu
Tuan-Anh Dang Nguyen
Minh Le Nguyen
43
5
0
26 May 2022
DisinfoMeme: A Multimodal Dataset for Detecting Meme Intentionally Spreading Out Disinformation
Jingnong Qu
Liunian Harold Li
Jieyu Zhao
Sunipa Dev
Kai-Wei Chang
21
12
0
25 May 2022
VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marçal Rusiñol
O. R. Terrades
VLM
54
30
0
24 May 2022
META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI
Liangtai Sun
Xingyu Chen
Lu Chen
Tianle Dai
Zichen Zhu
Kai Yu
LLMAG
26
51
0
23 May 2022
MATrIX -- Modality-Aware Transformer for Information eXtraction
Thomas Delteil
Edouard Belval
Lei Chen
Luis Goncalves
Vijay Mahadevan
25
3
0
17 May 2022
LayoutXLM vs. GNN: An Empirical Evaluation of Relation Extraction for Documents
Hervé Déjean
S. Clinchant
Jean-Luc Meunier
30
4
0
09 May 2022
Relational Representation Learning in Visually-Rich Documents
Xin Li
Yan Zheng
Yiqing Hu
H. Cao
Yunfei Wu
Deqiang Jiang
Yinsong Liu
Bo Ren
20
12
0
05 May 2022
Vision-Language Pre-Training for Boosting Scene Text Detectors
Sibo Song
Jianqiang Wan
Zhibo Yang
Jun Tang
Wenqing Cheng
Xiang Bai
Cong Yao
VLM
49
24
0
29 Apr 2022
Unified Pretraining Framework for Document Understanding
Jiuxiang Gu
Jason Kuen
Vlad I. Morariu
Handong Zhao
Nikolaos Barmpalios
R. Jain
A. Nenkova
Tong Sun
43
96
0
22 Apr 2022
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Yupan Huang
Tengchao Lv
Lei Cui
Yutong Lu
Furu Wei
35
437
0
18 Apr 2022
End-to-end Document Recognition and Understanding with Dessurt
Brian L. Davis
B. Morse
Brian L. Price
Chris Tensmeyer
Curtis Wigington
Vlad I. Morariu
VLM
ViT
37
73
0
30 Mar 2022
Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework
Zilong Wang
Jingbo Shang
36
10
0
30 Mar 2022
Visualizing the Relationship Between Encoded Linguistic Information and Task Performance
Jiannan Xiang
Huayang Li
Defu Lian
Guoping Huang
Taro Watanabe
Lemao Liu
42
0
0
29 Mar 2022
Multimodal Pre-training Based on Graph Attention Network for Document Understanding
Zhenrong Zhang
Jiefeng Ma
Jun Du
Licheng Wang
Jianshu Zhang
23
37
0
25 Mar 2022
Towards Structuring Real-World Data at Scale: Deep Learning for Extracting Key Oncology Information from Clinical Text with Patient-Level Supervision
Sam Preston
Mu-Hsin Wei
Rajesh Rao
Robert Tinn
Naoto Usuyama
...
Paul D. Tittel
Naveen Valluri
Tristan Naumann
Carlo Bifulco
Hoifung Poon
23
6
0
20 Mar 2022
HiStruct+: Improving Extractive Text Summarization with Hierarchical Structure Information
Qianqian Ruan
Malte Ostendorff
Georg Rehm
AILaw
36
56
0
17 Mar 2022
FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction
Chen-Yu Lee
Chun-Liang Li
Timothy Dozat
Vincent Perot
Guolong Su
Nan Hua
Joshua Ainslie
Renshen Wang
Yasuhisa Fujii
Tomas Pfister
25
77
0
16 Mar 2022
XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding
Zhangxuan Gu
Changhua Meng
Ke Wang
Jun Lan
Weiqiang Wang
Ming Gu
Liqing Zhang
39
78
0
14 Mar 2022
DiT: Self-supervised Pre-training for Document Image Transformer
Junlong Li
Yiheng Xu
Tengchao Lv
Lei Cui
Chaoxi Zhang
Furu Wei
ViT
VLM
47
160
0
04 Mar 2022
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding
Jiapeng Wang
Lianwen Jin
Kai Ding
VLM
35
140
0
28 Feb 2022
OCR-IDL: OCR Annotations for Industry Document Library Dataset
Ali Furkan Biten
Rubèn Pérez Tito
Lluís Gómez
Ernest Valveny
Dimosthenis Karatzas
27
26
0
25 Feb 2022
WebFormer: The Web-page Transformer for Structure Information Extraction
Qifan Wang
Yi Fang
Anirudh Ravula
Fuli Feng
Xiaojun Quan
Dongfang Liu
ViT
149
65
0
01 Feb 2022
DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer
Sanket Biswas
Ayan Banerjee
Josep Lladós
Umapada Pal
ViT
21
23
0
27 Jan 2022
Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks
Haoyu Dong
Zhoujun Cheng
Xinyi He
Mengyuan Zhou
Anda Zhou
Fan Zhou
Ao Liu
Shi Han
Dongmei Zhang
LMTD
65
64
0
24 Jan 2022
Cross-Domain Document Layout Analysis via Unsupervised Document Style Guide
Xingjiao Wu
Luwei Xiao
Xiangcheng Du
Yingbin Zheng
Xin Li
Tianlong Ma
Liangbo He
38
2
0
24 Jan 2022
Data-Efficient Information Extraction from Form-Like Documents
Beliz Gunel
Navneet Potti
Sandeep Tata
James Bradley Wendt
Marc Najork
Jing Xie
40
2
0
07 Jan 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA
Ali Furkan Biten
Ron Litman
Yusheng Xie
Srikar Appalaraju
R. Manmatha
ViT
36
100
0
23 Dec 2021
LAME: Layout Aware Metadata Extraction Approach for Research Articles
Jonghyun Choi
Hyesoo Kong
Hwamook Yoon
Heung-Seon Oh
Yuchul Jung
33
3
0
23 Dec 2021