DocFormer: End-to-End Transformer for Document Understanding

22 June 2021

Bhargava Urala Kota

Papers citing "DocFormer: End-to-End Transformer for Document Understanding"

35 / 185 papers shown

Understanding Long Documents with Different Position-Aware Attentions Hai Pham Guoxin Wang Yijuan Lu D. Florêncio Changrong Zhang 67 9 0 17 Aug 2022
Knowing Where and What: Unified Word Block Pretraining for Document Understanding Song Tao Zijian Wang Tiantian Fan Canjie Luo Can Huang SSL 80 2 0 28 Jul 2022
Towards Complex Document Understanding By Discrete Reasoning Fengbin Zhu Wenqiang Lei Fuli Feng Chao Wang Haozhou Zhang Tat-Seng Chua 109 48 0 25 Jul 2022
Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration Zhenyu Zhang Yu Bowen Haiyang Yu Tingwen Liu Cheng Fu Jingyang Li Chengguang Tang Jian Sun Yongbin Li 92 5 0 14 Jul 2022
GMN: Generative Multi-modal Network for Practical Document Information Extraction H. Cao Jiefeng Ma Antai Guo Yiqing Hu Hao Liu Deqiang Jiang Yinsong Liu Bo Ren 44 8 0 11 Jul 2022
Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding Chuwei Luo Guozhi Tang Qi Zheng Cong Yao Lianwen Jin Chenliang Li Yang Xue Luo Si 91 18 0 27 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge Linxi Fan Guanzhi Wang Yunfan Jiang Ajay Mandlekar Yuncong Yang Haoyi Zhu Andrew Tang De-An Huang Yuke Zhu Anima Anandkumar LM&Ro 150 388 0 17 Jun 2022
MixGen: A New Multi-Modal Data Augmentation Xiaoshuai Hao Yi Zhu Srikar Appalaraju Aston Zhang Wanqian Zhang Boyang Li Mu Li VLM 113 90 0 16 Jun 2022
Test-Time Adaptation for Visual Document Understanding Sayna Ebrahimi Sercan O. Arik Tomas Pfister OOD 76 6 0 15 Jun 2022
RDU: A Region-based Approach to Form-style Document Understanding Fengbin Zhu Chao Wang Wenqiang Lei Ziyang Liu Tat-Seng Chua 55 2 0 14 Jun 2022
Multimodal Learning with Transformers: A Survey Peng Xu Xiatian Zhu David Clifton ViT 236 575 0 13 Jun 2022
VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification Souhail Bakkali Zuheng Ming Mickael Coustaty Marçal Rusiñol O. R. Terrades VLM 99 30 0 24 May 2022
MATrIX -- Modality-Aware Transformer for Information eXtraction Thomas Delteil Edouard Belval Lei Chen Luis Goncalves Vijay Mahadevan 89 3 0 17 May 2022
Relational Representation Learning in Visually-Rich Documents Xin Li Yan Zheng Yiqing Hu H. Cao Yunfei Wu Deqiang Jiang Yinsong Liu Bo Ren 111 12 0 05 May 2022
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Yupan Huang Tengchao Lv Lei Cui Yutong Lu Furu Wei 129 464 0 18 Apr 2022
End-to-end Document Recognition and Understanding with Dessurt Brian L. Davis B. Morse Brian L. Price Chris Tensmeyer Curtis Wigington Vlad I. Morariu VLM ViT 121 73 0 30 Mar 2022
Towards End-to-End Unified Scene Text Detection and Layout Analysis Shangbang Long Siyang Qin Dmitry Panteleev Alessandro Bissacco Yasuhisa Fujii Michalis Raptis 97 97 0 28 Mar 2022
Multimodal Pre-training Based on Graph Attention Network for Document Understanding Zhenrong Zhang Jiefeng Ma Jun Du Licheng Wang Jianshu Zhang 55 38 0 25 Mar 2022
FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction Chen-Yu Lee Chun-Liang Li Timothy Dozat Vincent Perot Guolong Su Nan Hua Joshua Ainslie Renshen Wang Yasuhisa Fujii Tomas Pfister 96 79 0 16 Mar 2022
XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding Zhangxuan Gu Changhua Meng Ke Wang Jun Lan Weiqiang Wang Ming Gu Liqing Zhang 101 79 0 14 Mar 2022
Image Search with Text Feedback by Additive Attention Compositional Learning Yuxin Tian Shawn D. Newsam K. Boakye CoGe 70 13 0 08 Mar 2022
DiT: Self-supervised Pre-training for Document Image Transformer Junlong Li Yiheng Xu Tengchao Lv Lei Cui Chaoxi Zhang Furu Wei ViT VLM 126 170 0 04 Mar 2022
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding Jiapeng Wang Lianwen Jin Kai Ding VLM 76 143 0 28 Feb 2022
OCR-IDL: OCR Annotations for Industry Document Library Dataset Ali Furkan Biten Rubèn Pérez Tito Lluís Gómez Ernest Valveny Dimosthenis Karatzas 72 30 0 25 Feb 2022
A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility Andrea Burns Deniz Arsan Sanjna Agrawal Ranjitha Kumar Kate Saenko Bryan A. Plummer 120 65 0 04 Feb 2022
DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer Sanket Biswas Ayan Banerjee Josep Lladós Umapada Pal ViT 110 23 0 27 Jan 2022
DocEnTr: An End-to-End Document Image Enhancement Transformer Mohamed Ali Souibgui Sanket Biswas Sana Khamekhem Jemni Yousri Kessentini Alicia Fornés Josep Lladós Umapada Pal ViT 121 47 0 25 Jan 2022
Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks Haoyu Dong Zhoujun Cheng Xinyi He Mengyuan Zhou Anda Zhou Fan Zhou Ao Liu Shi Han Dongmei Zhang LMTD 151 65 0 24 Jan 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA Ali Furkan Biten Ron Litman Yusheng Xie Srikar Appalaraju R. Manmatha ViT 123 102 0 23 Dec 2021
Value Retrieval with Arbitrary Queries for Form-like Documents M. Gao Le Xue Chetan Ramaiah Chen Xing Ran Xu Caiming Xiong 161 6 0 15 Dec 2021
OCR-free Document Understanding Transformer Geewook Kim Teakgyu Hong Moonbin Yim Jeongyeon Nam Jinyoung Park Jinyeong Yim Wonseok Hwang Sangdoo Yun Dongyoon Han Seunghyun Park ViT 148 274 0 30 Nov 2021
Document AI: Benchmarks, Models and Applications Lei Cui Yiheng Xu Tengchao Lv Furu Wei VLM 89 74 0 16 Nov 2021
MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding Junlong Li Yiheng Xu Lei Cui Furu Wei VLM 3DGS 101 65 0 16 Oct 2021
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents Teakgyu Hong Donghyun Kim Mingi Ji Wonseok Hwang Daehyun Nam Sungrae Park VLM 124 154 0 10 Aug 2021
InfographicVQA Minesh Mathew Viraj Bagal Rubèn Pérez Tito Dimosthenis Karatzas Ernest Valveny C. V. Jawahar 110 242 0 26 Apr 2021