LayoutLM: Pre-training of Text and Layout for Document Image Understanding

31 December 2019
Yiheng Xu
Minghao Li
Lei Cui
Shaohan Huang
Furu Wei
Ming Zhou

Papers citing "LayoutLM: Pre-training of Text and Layout for Document Image Understanding"

50 / 371 papers shown
Large Scale Genealogical Information Extraction From Handwritten Quebec Parish Records
Solène Tarride
Martin Maarand
Mélodie Boillet
James McGrath
Eugénie Capel
H. Vézina
Christopher Kermorvant
30
10
0
27 Apr 2023
DualSlide: Global-to-Local Sketching Interface for Slide Content and Layout Design
Jiahao Weng
Xu Du
Haoran Xie
30
1
0
25 Apr 2023
DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents
M. Dhouib
G. Bettaieb
A. Shabou
27
21
0
24 Apr 2023
PARAGRAPH2GRAPH: A GNN-based framework for layout paragraph analysis
Shuyong Wei
Nuo Xu
27
5
0
24 Apr 2023
Information Extraction from Documents: Question Answering vs Token Classification in real-world setups
Laurent Lam
Pirashanth Ratnamogan
Joel Tang
William Vanhuffel
Fabien Caspani
24
0
0
21 Apr 2023
GeoLayoutLM: Geometric Pre-training for Visual Information Extraction
Chuwei Luo
Changxu Cheng
Qi Zheng
Cong Yao
35
44
0
21 Apr 2023
MPMQA: Multimodal Question Answering on Product Manuals
Liangfu Zhang
Anwen Hu
Jing Zhang
Shuo Hu
Qin Jin
30
9
0
19 Apr 2023
A Question-Answering Approach to Key Value Pair Extraction from Form-like Document Images
Kai Hu
Zhuoyuan Wu
Zhuoyao Zhong
Weihong Lin
Lei-huan Sun
Qiang Huo
26
11
0
17 Apr 2023
Expressive Text-to-Image Generation with Rich Text
Songwei Ge
Taesung Park
Jun-Yan Zhu
Jia-Bin Huang
DiffM
79
79
0
13 Apr 2023
PDFVQA: A New Dataset for Real-World VQA on PDF Documents
Yihao Ding
Siwen Luo
Hyunsuk Chung
S. Han
33
17
0
13 Apr 2023
ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules
Zhi-Qi Cheng
Qianwen Dai
Siyao Li
Jingdong Sun
Teruko Mitamura
Alexander G. Hauptmann
29
21
0
05 Apr 2023
Locate Then Generate: Bridging Vision and Language with Bounding Box for Scene-Text VQA
Yongxin Zhu
Zichen Liu
Yukang Liang
Xin Li
Hao Liu
Changcun Bao
Linli Xu
26
6
0
04 Apr 2023
Towards Flexible Multi-modal Document Models
Naoto Inoue
Kotaro Kikuchi
E. Simo-Serra
Mayu Otani
Kota Yamaguchi
42
20
0
31 Mar 2023
The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces
Kyle Lo
Joseph Chee Chang
Andrew Head
Jonathan Bragg
Amy X. Zhang
...
Caroline M Wu
Jiangjiang Yang
Angele Zamarron
Marti A. Hearst
Daniel S. Weld
34
19
0
25 Mar 2023
Modeling Entities as Semantic Points for Visual Information Extraction in the Wild
Zhibo Yang
Rujiao Long
Pengfei Wang
Sibo Song
Humen Zhong
Wenqing Cheng
X. Bai
Cong Yao
41
22
0
23 Mar 2023
ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction
Jiabang He
Lei Wang
Yingpeng Hu
Ning Liu
Hui-juan Liu
Xingdong Xu
Hengtao Shen
MLLM
6
46
0
09 Mar 2023
LORE: Logical Location Regression Network for Table Structure Recognition
Hangdi Xing
Feiyu Gao
Rujiao Long
Jiajun Bu
Qi Zheng
Liangcheng Li
Cong Yao
Zhi Yu
LMTD
37
19
0
07 Mar 2023
StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training
Yu Yu
Yulin Li
Chengquan Zhang
Xiaoqiang Zhang
Zengyuan Guo
Xiameng Qin
Kun Yao
Junyu Han
Errui Ding
Jingdong Wang
24
45
0
01 Mar 2023
A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram
Ming-Liang Zhang
Fei Yin
Cheng-Lin Liu
AI4CE
58
41
0
22 Feb 2023
Optimising Human-Machine Collaboration for Efficient High-Precision Information Extraction from Text Documents
Bradley Butcher
Miri Zilka
Darren Cook
Jiri Hron
Adrian Weller
38
3
0
18 Feb 2023
Entry Separation using a Mixed Visual and Textual Language Model: Application to 19th century French Trade Directories
Bertrand Duménieu
Edwin Carlinet
N. Abadie
Joseph Chazalon
29
0
0
17 Feb 2023
DocILE Benchmark for Document Information Localization and Extraction
Štěpán Šimsa
Milan Šulc
Michal Uřičář
Yash J. Patel
Ahmed Hamdi
...
Matyáš Skalický
Jiří Matas
Antoine Doucet
Mickael Coustaty
Dimosthenis Karatzas
24
34
0
11 Feb 2023
CTE: A Dataset for Contextualized Table Extraction
Andrea Gemelli
Emanuele Vivoli
S. Marinai
LMTD
13
2
0
02 Feb 2023
Evaluating TCFD Reporting: A New Application of Zero-Shot Analysis to Climate-Related Financial Disclosures
Alix Auzepy
Elena Tönjes
David Lenz
C. Funk
33
5
0
01 Feb 2023
Layout-aware Webpage Quality Assessment
Anfeng Cheng
Yiding Liu
Weibin Li
Qian Dong
Shuaiqiang Wang
Zhengjie Huang
Shikun Feng
Zhicong Cheng
Dawei Yin
3DV
35
4
0
28 Jan 2023
LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization
Laura Nguyen
Thomas Scialom
Benjamin Piwowarski
Jacopo Staiano
27
7
0
26 Jan 2023
Towards Models that Can See and Read
Roy Ganz
Oren Nuriel
Aviad Aberdam
Yair Kittenplon
Shai Mazor
Ron Litman
24
13
0
18 Jan 2023
SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images
Ryota Tanaka
Kyosuke Nishida
Kosuke Nishida
Taku Hasegawa
Itsumi Saito
Kuniko Saito
25
74
0
12 Jan 2023
MessageNet: Message Classification using Natural Language Processing and Meta-data
Adar Kahana
Oren Elisha
9
0
0
04 Jan 2023
Interactive Layout Drawing Interface with Shadow Guidance
Jiahao Weng
Haoran Xie
3DV
13
2
0
26 Dec 2022
MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering
Fangyu Liu
Francesco Piccinno
Syrine Krichene
Chenxi Pang
Kenton Lee
Mandar Joshi
Yasemin Altun
Nigel Collier
Julian Martin Eisenschlos
VLM
LRM
19
89
0
19 Dec 2022
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding
Haoli Bai
Zhiguang Liu
Xiaojun Meng
Wentao Li
Shuangning Liu
...
Liangwei Wang
Lu Hou
Jiansheng Wei
Xin Jiang
Qun Liu
ViT
35
13
0
19 Dec 2022
Hierarchical multimodal transformers for Multi-Page DocVQA
Rubèn Pérez Tito
Dimosthenis Karatzas
Ernest Valveny
13
56
0
07 Dec 2022
Multimodal Tree Decoder for Table of Contents Extraction in Document Images
Pengfei Hu
Zhenrong Zhang
Jianshu Zhang
Jun Du
Jiajia Wu
25
12
0
06 Dec 2022
Unifying Vision, Text, and Layout for Universal Document Processing
Zineng Tang
Ziyi Yang
Guoxin Wang
Yuwei Fang
Yang Liu
Chenguang Zhu
Michael Zeng
Chao-Yue Zhang
Joey Tianyi Zhou
VLM
32
107
0
05 Dec 2022
MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding
Zilong Wang
Jiuxiang Gu
Chris Tensmeyer
Nikolaos Barmpalios
A. Nenkova
Tong Sun
Jingbo Shang
Vlad I. Morariu
VLM
25
12
0
27 Nov 2022
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models
Lei Wang
Jian He
Xingdong Xu
Ning Liu
Hui-juan Liu
41
2
0
27 Nov 2022
VRDU: A Benchmark for Visually-rich Document Understanding
Zilong Wang
Yichao Zhou
Wei Wei
Chen-Yu Lee
Sandeep Tata
30
15
0
15 Nov 2022
QueryForm: A Simple Zero-shot Form Entity Query Framework
Zifeng Wang
Zizhao Zhang
Jacob Devlin
Chen-Yu Lee
Guolong Su
Hao Zhang
Jennifer Dy
Vincent Perot
Tomas Pfister
27
8
0
14 Nov 2022
Unimodal and Multimodal Representation Training for Relation Extraction
Ciaran Cooney
Rachel Heyburn
Liam Maddigan
Mairead O'Cuinn
Chloe Thompson
Joana Cavadas
36
2
0
11 Nov 2022
On Web-based Visual Corpus Construction for Visual Document Understanding
Donghyun Kim
Teakgyu Hong
Moonbin Yim
Yoonsik Kim
Geewook Kim
39
4
0
07 Nov 2022
Radically Lower Data-Labeling Costs for Visually Rich Document Extraction Models
Yichao Zhou
James Bradley Wendt
Navneet Potti
Jing Xie
Sandeep Tata
VLM
32
1
0
28 Oct 2022
ReSel: N-ary Relation Extraction from Scientific Text and Tables by Learning to Retrieve and Select
Yuchen Zhuang
Yinghao Li
Jerry Junyang Cheung
Yue Yu
Yingjun Mou
Xinyu Chen
Le Song
Chao Zhang
29
19
0
26 Oct 2022
Evaluating Out-of-Distribution Performance on Document Image Classifiers
Stefan Larson
Gordon Lim
Yutong Ai
David Kuang
Kevin Leach
OODD
OOD
37
18
0
14 Oct 2022
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding
Qiming Peng
Yinxu Pan
Wenjin Wang
Bin Luo
Zhenyu Zhang
...
Shi Feng
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
13
83
0
12 Oct 2022
PP-StructureV2: A Stronger Document Analysis System
Chenxia Li
Ruoyu Guo
Jun Zhou
Mengtao An
Yuning Du
Lingfeng Zhu
Yi Liu
Xiaoguang Hu
Dianhai Yu
57
22
0
11 Oct 2022
Key Information Extraction in Purchase Documents using Deep Learning and Rule-based Corrections
R. Arroyo
J. Yebes
E. Martínez
Hector Corrales
Javier Lorenzo
33
1
0
07 Oct 2022
XDoc: Unified Pre-training for Cross-Format Document Understanding
Jingye Chen
Tengchao Lv
Lei Cui
Changrong Zhang
Furu Wei
52
13
0
06 Oct 2022
ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding
Wenjin Wang
Zhengjie Huang
Bin Luo
Qianglong Chen
Qiming Peng
...
Weichong Yin
Shi Feng
Yu Sun
Dianhai Yu
Yin Zhang
ViT
35
11
0
18 Sep 2022
One-Shot Doc Snippet Detection: Powering Search in Document Beyond Text
Abhinav Java
Shripad Deshmukh
Milan Aggarwal
Surgan Jandial
Mausoom Sarkar
Balaji Krishnamurthy
32
3
0
12 Sep 2022