MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding
arXiv:2211.14958 · 27 November 2022
Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, A. Nenkova, Tong Sun, Jingbo Shang, Vlad I. Morariu

Papers citing "MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding" (18 papers)
  1. Unified Pretraining Framework for Document Understanding. Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Nikolaos Barmpalios, R. Jain, A. Nenkova, Tong Sun (22 Apr 2022)
  2. BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents. Teakgyu Hong, Donghyun Kim, Mingi Ji, Wonseok Hwang, Daehyun Nam, Sungrae Park (10 Aug 2021)
  3. DocFormer: End-to-End Transformer for Document Understanding. Srikar Appalaraju, Bhavan A. Jasani, Bhargava Urala Kota, Yusheng Xie, R. Manmatha (22 Jun 2021)
  4. SelfDoc: Self-Supervised Document Representation Learning. Peizhao Li, Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, R. Jain, Varun Manjunatha, Hongfu Liu (07 Jun 2021)
  5. Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer. Rafal Powalski, Łukasz Borchmann, Dawid Jurkiewicz, Tomasz Dwojak, Michal Pietruszka, Gabriela Pałka (18 Feb 2021)
  6. LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding. Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, ..., D. Florêncio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou (29 Dec 2020)
  7. LAMBERT: Layout-Aware (Language) Modeling for Information Extraction. Lukasz Garncarek, Rafal Powalski, Tomasz Stanislawek, Bartosz Topolski, Piotr Halama, M. Turski, Filip Graliński (19 Feb 2020)
  8. LayoutLM: Pre-training of Text and Layout for Document Image Understanding. Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou (31 Dec 2019)
  9. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu (23 Oct 2019)
  10. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. Nils Reimers, Iryna Gurevych (27 Aug 2019)
  11. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks. Jiasen Lu, Dhruv Batra, Devi Parikh, Stefan Lee (06 Aug 2019)
  12. FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents. Guillaume Jaume, H. K. Ekenel, Jean-Philippe Thiran (27 May 2019)
  13. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. Zihang Dai, Zhilin Yang, Yiming Yang, J. Carbonell, Quoc V. Le, Ruslan Salakhutdinov (09 Jan 2019)
  14. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova (11 Oct 2018)
  15. Attention Is All You Need. Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin (12 Jun 2017)
  16. Aggregated Residual Transformations for Deep Neural Networks. Saining Xie, Ross B. Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He (16 Nov 2016)
  17. Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun (10 Dec 2015)
  18. Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval. Adam W. Harley, Alex Ufkes, Konstantinos G. Derpanis (25 Feb 2015)