Information Extraction from Visually Rich Documents using LLM-based Organization of Documents into Independent Textual Segments

18 May 2025

Aniket Bhattacharyya

Anurag Tripathi

Archan Karmakar

Papers cited by "Information Extraction from Visually Rich Documents using LLM-based Organization of Documents into Independent Textual Segments"

DocLLM: A layout-aware generative language model for multimodal document understanding. Dongsheng Wang, Natraj Raman, Mathieu Sibue, Zhiqiang Ma, Petr Babkin, Simerjot Kaur, Yulong Pei, Armineh Nourbakhsh, Xiaomo Liu. 31 Dec 2023.
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking. Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei. 18 Apr 2022.
DocFormer: End-to-End Transformer for Document Understanding. Srikar Appalaraju, Bhavan A. Jasani, Bhargava Urala Kota, Yusheng Xie, R. Manmatha. 22 Jun 2021.
LoRA: Low-Rank Adaptation of Large Language Models. J. E. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen. 17 Jun 2021.
ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction. Zheng Huang, Kai Chen, Jianhua He, X. Bai, Dimosthenis Karatzas, Shijian Lu, C. V. Jawahar. 18 Mar 2021.
DocVQA: A Dataset for VQA on Document Images. Minesh Mathew, Dimosthenis Karatzas, C. V. Jawahar. 01 Jul 2020.
Language Models are Few-Shot Learners. Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei. 28 May 2020.
LayoutLM: Pre-training of Text and Layout for Document Image Understanding. Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou. 31 Dec 2019.
RoBERTa: A Robustly Optimized BERT Pretraining Approach. Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov. 26 Jul 2019.
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents. Guillaume Jaume, H. K. Ekenel, Jean-Philippe Thiran. 27 May 2019.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. 11 Oct 2018.
Chargrid: Towards Understanding 2D Documents. Anoop R. Katti, C. Reisswig, Cordula Guder, Sebastian Brarda, S. Bickel, Johannes Höhne, Jean Baptiste Faddoul. 24 Sep 2018.