Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2106.11539
Cited By
v1
v2 (latest)
DocFormer: End-to-End Transformer for Document Understanding
22 June 2021
Srikar Appalaraju
Bhavan A. Jasani
Bhargava Urala Kota
Yusheng Xie
R. Manmatha
ViT
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"DocFormer: End-to-End Transformer for Document Understanding"
35 / 185 papers shown
Title
Understanding Long Documents with Different Position-Aware Attentions
Hai Pham
Guoxin Wang
Yijuan Lu
D. Florêncio
Changrong Zhang
67
9
0
17 Aug 2022
Knowing Where and What: Unified Word Block Pretraining for Document Understanding
Song Tao
Zijian Wang
Tiantian Fan
Canjie Luo
Can Huang
SSL
80
2
0
28 Jul 2022
Towards Complex Document Understanding By Discrete Reasoning
Fengbin Zhu
Wenqiang Lei
Fuli Feng
Chao Wang
Haozhou Zhang
Tat-Seng Chua
109
48
0
25 Jul 2022
Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration
Zhenyu Zhang
Yu Bowen
Haiyang Yu
Tingwen Liu
Cheng Fu
Jingyang Li
Chengguang Tang
Jian Sun
Yongbin Li
92
5
0
14 Jul 2022
GMN: Generative Multi-modal Network for Practical Document Information Extraction
H. Cao
Jiefeng Ma
Antai Guo
Yiqing Hu
Hao Liu
Deqiang Jiang
Yinsong Liu
Bo Ren
44
8
0
11 Jul 2022
Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding
Chuwei Luo
Guozhi Tang
Qi Zheng
Cong Yao
Lianwen Jin
Chenliang Li
Yang Xue
Luo Si
91
18
0
27 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
150
388
0
17 Jun 2022
MixGen: A New Multi-Modal Data Augmentation
Xiaoshuai Hao
Yi Zhu
Srikar Appalaraju
Aston Zhang
Wanqian Zhang
Boyang Li
Mu Li
VLM
113
90
0
16 Jun 2022
Test-Time Adaptation for Visual Document Understanding
Sayna Ebrahimi
Sercan O. Arik
Tomas Pfister
OOD
76
6
0
15 Jun 2022
RDU: A Region-based Approach to Form-style Document Understanding
Fengbin Zhu
Chao Wang
Wenqiang Lei
Ziyang Liu
Tat-Seng Chua
55
2
0
14 Jun 2022
Multimodal Learning with Transformers: A Survey
Peng Xu
Xiatian Zhu
David Clifton
ViT
236
575
0
13 Jun 2022
VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marccal Rusinol
O. R. Terrades
VLM
99
30
0
24 May 2022
MATrIX -- Modality-Aware Transformer for Information eXtraction
Thomas Delteil
Edouard Belval
Lei Chen
Luis Goncalves
Vijay Mahadevan
89
3
0
17 May 2022
Relational Representation Learning in Visually-Rich Documents
Xin Li
Yan Zheng
Yiqing Hu
H. Cao
Yunfei Wu
Deqiang Jiang
Yinsong Liu
Bo Ren
111
12
0
05 May 2022
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Yupan Huang
Tengchao Lv
Lei Cui
Yutong Lu
Furu Wei
129
464
0
18 Apr 2022
End-to-end Document Recognition and Understanding with Dessurt
Brian L. Davis
B. Morse
Brian L. Price
Chris Tensmeyer
Curtis Wigington
Vlad I. Morariu
VLM
ViT
121
73
0
30 Mar 2022
Towards End-to-End Unified Scene Text Detection and Layout Analysis
Shangbang Long
Siyang Qin
Dmitry Panteleev
Alessandro Bissacco
Yasuhisa Fujii
Michalis Raptis
97
97
0
28 Mar 2022
Multimodal Pre-training Based on Graph Attention Network for Document Understanding
Zhenrong Zhang
Jiefeng Ma
Jun Du
Licheng Wang
Jianshu Zhang
55
38
0
25 Mar 2022
FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction
Chen-Yu Lee
Chun-Liang Li
Timothy Dozat
Vincent Perot
Guolong Su
Nan Hua
Joshua Ainslie
Renshen Wang
Yasuhisa Fujii
Tomas Pfister
96
79
0
16 Mar 2022
XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding
Zhangxuan Gu
Changhua Meng
Ke Wang
Jun Lan
Weiqiang Wang
Ming Gu
Liqing Zhang
101
79
0
14 Mar 2022
Image Search with Text Feedback by Additive Attention Compositional Learning
Yuxin Tian
Shawn D. Newsam
K. Boakye
CoGe
70
13
0
08 Mar 2022
DiT: Self-supervised Pre-training for Document Image Transformer
Junlong Li
Yiheng Xu
Tengchao Lv
Lei Cui
Chaoxi Zhang
Furu Wei
ViT
VLM
126
170
0
04 Mar 2022
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding
Jiapeng Wang
Lianwen Jin
Kai Ding
VLM
76
143
0
28 Feb 2022
OCR-IDL: OCR Annotations for Industry Document Library Dataset
Ali Furkan Biten
Rubèn Pérez Tito
Lluís Gómez
Ernest Valveny
Dimosthenis Karatzas
72
30
0
25 Feb 2022
A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility
Andrea Burns
Deniz Arsan
Sanjna Agrawal
Ranjitha Kumar
Kate Saenko
Bryan A. Plummer
120
65
0
04 Feb 2022
DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer
Sanket Biswas
Ayan Banerjee
Josep Lladós
Umapada Pal
ViT
110
23
0
27 Jan 2022
DocEnTr: An End-to-End Document Image Enhancement Transformer
Mohamed Ali Souibgui
Sanket Biswas
Sana Khamekhem Jemni
Yousri Kessentini
Alicia Fornés
Josep Lladós
Umapada Pal
ViT
121
47
0
25 Jan 2022
Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks
Haoyu Dong
Zhoujun Cheng
Xinyi He
Mengyuan Zhou
Anda Zhou
Fan Zhou
Ao Liu
Shi Han
Dongmei Zhang
LMTD
151
65
0
24 Jan 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA
Ali Furkan Biten
Ron Litman
Yusheng Xie
Srikar Appalaraju
R. Manmatha
ViT
123
102
0
23 Dec 2021
Value Retrieval with Arbitrary Queries for Form-like Documents
M. Gao
Le Xue
Chetan Ramaiah
Chen Xing
Ran Xu
Caiming Xiong
161
6
0
15 Dec 2021
OCR-free Document Understanding Transformer
Geewook Kim
Teakgyu Hong
Moonbin Yim
Jeongyeon Nam
Jinyoung Park
Jinyeong Yim
Wonseok Hwang
Sangdoo Yun
Dongyoon Han
Seunghyun Park
ViT
148
274
0
30 Nov 2021
Document AI: Benchmarks, Models and Applications
Lei Cui
Yiheng Xu
Tengchao Lv
Furu Wei
VLM
89
74
0
16 Nov 2021
MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding
Junlong Li
Yiheng Xu
Lei Cui
Furu Wei
VLM
3DGS
101
65
0
16 Oct 2021
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents
Teakgyu Hong
Donghyun Kim
Mingi Ji
Wonseok Hwang
Daehyun Nam
Sungrae Park
VLM
124
154
0
10 Aug 2021
InfographicVQA
Minesh Mathew
Viraj Bagal
Rubèn Pérez Tito
Dimosthenis Karatzas
Ernest Valveny
C. V. Jawahar
110
242
0
26 Apr 2021
Previous
1
2
3
4