Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2409.19672
Cited By
Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding
29 September 2024
Chong Zhang
Yi Tu
Yixi Zhao
Chenshu Yuan
Huan Chen
Yue Zhang
Mingxu Chai
Ya Guo
Huijia Zhu
Qi Zhang
Tao Gui
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding"
26 / 26 papers shown
Title
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions
Ryota Tanaka
Taichi Iki
Kyosuke Nishida
Kuniko Saito
Jun Suzuki
VLM
57
23
0
24 Jan 2024
DocLLM: A layout-aware generative language model for multimodal document understanding
Dongsheng Wang
Natraj Raman
Mathieu Sibue
Zhiqiang Ma
Petr Babkin
Simerjot Kaur
Yulong Pei
Armineh Nourbakhsh
Xiaomo Liu
VLM
73
58
0
31 Dec 2023
Enhancing Visually-Rich Document Understanding via Layout Structure Modeling
Qiwei Li
Z. Li
Xiantao Cai
Bo Du
Hai Zhao
49
8
0
15 Aug 2023
GVdoc: Graph-based Visual Document Classification
Fnu Mohbat
Mohammed J Zaki
Catherine Finegan-Dollak
Ashish Verma
OOD
58
1
0
26 May 2023
Linear-Time Modeling of Linguistic Structure: An Order-Theoretic Perspective
Tianyu Liu
Afra Amini
Mrinmaya Sachan
Ryan Cotterell
79
2
0
24 May 2023
Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation
Prashant Krishnan
Zilong Wang
Yangkun Wang
Jingbo Shang
48
3
0
24 May 2023
Unifying Vision, Text, and Layout for Universal Document Processing
Zineng Tang
Ziyi Yang
Guoxin Wang
Yuwei Fang
Yang Liu
Chenguang Zhu
Michael Zeng
Chao-Yue Zhang
Joey Tianyi Zhou
VLM
82
112
0
05 Dec 2022
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding
Qiming Peng
Yinxu Pan
Wenjin Wang
Bin Luo
Zhenyu Zhang
...
Shi Feng
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
64
83
0
12 Oct 2022
Knowing Where and What: Unified Word Block Pretraining for Document Understanding
Song Tao
Zijian Wang
Tiantian Fan
Canjie Luo
Can Huang
SSL
66
2
0
28 Jul 2022
Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding
Chuwei Luo
Guozhi Tang
Qi Zheng
Cong Yao
Lianwen Jin
Chenliang Li
Yang Xue
Luo Si
63
18
0
27 Jun 2022
Relational Representation Learning in Visually-Rich Documents
Xin Li
Yan Zheng
Yiqing Hu
H. Cao
Yunfei Wu
Deqiang Jiang
Yinsong Liu
Bo Ren
106
12
0
05 May 2022
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Yupan Huang
Tengchao Lv
Lei Cui
Yutong Lu
Furu Wei
92
454
0
18 Apr 2022
FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction
Chen-Yu Lee
Chun-Liang Li
Timothy Dozat
Vincent Perot
Guolong Su
Nan Hua
Joshua Ainslie
Renshen Wang
Yasuhisa Fujii
Tomas Pfister
73
78
0
16 Mar 2022
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding
Jiapeng Wang
Lianwen Jin
Kai Ding
VLM
67
142
0
28 Feb 2022
Entity Relation Extraction as Dependency Parsing in Visually Rich Documents
Yue Zhang
Bo Zhang
Rui Wang
Junjie Cao
Chen Li
Zuyi Bao
78
32
0
19 Oct 2021
LayoutReader: Pre-training of Text and Layout for Reading Order Detection
Zilong Wang
Yiheng Xu
Lei Cui
Jingbo Shang
Furu Wei
59
76
0
26 Aug 2021
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents
Teakgyu Hong
Donghyun Kim
Mingi Ji
Wonseok Hwang
Daehyun Nam
Sungrae Park
VLM
73
153
0
10 Aug 2021
StrucTexT: Structured Text Understanding with Multi-Modal Transformers
Yulin Li
Yuxi Qian
Yuchen Yu
Xiameng Qin
Chengquan Zhang
Yan Liu
Kun Yao
Junyu Han
Jingtuo Liu
Errui Ding
75
117
0
06 Aug 2021
DocFormer: End-to-End Transformer for Document Understanding
Srikar Appalaraju
Bhavan A. Jasani
Bhargava Urala Kota
Yusheng Xie
R. Manmatha
ViT
86
275
0
22 Jun 2021
LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding
Yiheng Xu
Tengchao Lv
Lei Cui
Guoxin Wang
Yijuan Lu
D. Florêncio
Cha Zhang
Furu Wei
MLLM
VLM
72
130
0
18 Apr 2021
ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction
Zheng Huang
Kai Chen
Jianhua He
X. Bai
Dimosthenis Karatzas
Shijian Lu
C. V. Jawahar
52
315
0
18 Mar 2021
Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer
Rafal Powalski
Łukasz Borchmann
Dawid Jurkiewicz
Tomasz Dwojak
Michal Pietruszka
Gabriela Pałka
ViT
66
157
0
18 Feb 2021
DocVQA: A Dataset for VQA on Document Images
Minesh Mathew
Dimosthenis Karatzas
C. V. Jawahar
142
739
0
01 Jul 2020
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
163
369
0
27 May 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
85
1,244
0
18 Apr 2019
Graph Convolution for Multimodal Information Extraction from Visually Rich Documents
Xiaojing Liu
Feiyu Gao
Qiong Zhang
Huasha Zhao
63
183
0
27 Mar 2019
1