MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding

16 October 2021
Junlong Li
Yiheng Xu
Lei Cui
Furu Wei
    VLM
    3DGS

Papers citing "MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding"

36 / 36 papers shown
Multilingual Attribute Extraction from News Web Pages
Pavel Bedrin
Maksim Varlamov
Alexander Yatskov
56
1
0
04 Feb 2025
A comprehensive study of on-device NLP applications -- VQA, automated Form filling, Smart Replies for Linguistic Codeswitching
Naman Goyal
21
0
0
23 Sep 2024
Understanding Privacy Norms through Web Forms
Hao Cui
R. Trimananda
A. Markopoulou
AILaw
22
0
0
29 Aug 2024
AutoFAIR : Automatic Data FAIRification via Machine Reading
Tingyan Ma
Wei Liu
Bin Lu
Xiaoying Gan
Yunqiang Zhu
Luoyi Fu
Chenghu Zhou
19
0
0
07 Aug 2024
Deep Learning based Visually Rich Document Content Understanding: A Survey
Muhammad Ali
Jean Lee
Salman Khan
44
6
0
02 Aug 2024
WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation
Zirui Shao
Feiyu Gao
Hangdi Xing
Zepeng Zhu
Zhi Yu
Jiajun Bu
Qi Zheng
Cong Yao
28
2
0
22 Jul 2024
Enhancing Mobile "How-to" Queries with Automated Search Results Verification and Reranking
Lei Ding
Jeshwanth Bheemanpally
Yi Zhang
34
1
0
13 Apr 2024
Enhancing Vision-Language Pre-training with Rich Supervisions
Yuan Gao
Kunyu Shi
Pengkai Zhu
Edouard Belval
Oren Nuriel
Srikar Appalaraju
Shabnam Ghadar
Vijay Mahadevan
Zhuowen Tu
Stefano Soatto
VLM
CLIP
67
12
0
05 Mar 2024
Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding
Hongshen Xu
Lu Chen
Zihan Zhao
Da Ma
Ruisheng Cao
Zichen Zhu
Kai Yu
37
2
0
28 Feb 2024
Cleaner Pretraining Corpus Curation with Neural Web Scraping
Zhipeng Xu
Zhenghao Liu
Yukun Yan
Zhiyuan Liu
Ge Yu
Chenyan Xiong
CLIP
OnRL
21
4
0
22 Feb 2024
LAPDoc: Layout-Aware Prompting for Documents
Marcel Lamott
Yves-Noel Weweler
A. Ulges
Faisal Shafait
Dirk Krechel
Darko Obradovic
51
5
0
15 Feb 2024
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Xing Han Lù
Zdeněk Kasner
Siva Reddy
32
59
0
08 Feb 2024
URL-BERT: Training Webpage Representations via Social Media Engagements
A. Qamar
Chetan Verma
Ahmed El-Kishky
Sumit Binnani
Sneha Mehta
Taylor Berg-Kirkpatrick
17
0
0
25 Oct 2023
Kosmos-2.5: A Multimodal Literate Model
Tengchao Lv
Yupan Huang
Jingye Chen
Lei Cui
Shuming Ma
...
Weiyao Luo
Shaoxiang Wu
Guoxin Wang
Cha Zhang
Furu Wei
VLM
MLLM
31
63
0
20 Sep 2023
A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis
Izzeddin Gur
Hiroki Furuta
Austin Huang
Mustafa Safdari
Yutaka Matsuo
Douglas Eck
Aleksandra Faust
LM&Ro
LLMAG
39
198
0
24 Jul 2023
Multimodal Document Analytics for Banking Process Automation
C. Gerling
Stefan Lessmann
30
3
0
21 Jul 2023
DocFormerv2: Local Features for Document Understanding
Srikar Appalaraju
Peng Tang
Qi Dong
Nishant Sankaran
Yichu Zhou
R. Manmatha
28
39
0
02 Jun 2023
Schema-Driven Information Extraction from Heterogeneous Tables
Fan Bai
Junmo Kang
Gabriel Stanovsky
Dayne Freitag
Alan Ritter
LMTD
15
10
0
23 May 2023
Towards Zero-shot Relation Extraction in Web Mining: A Multimodal Approach with Relative XML Path
Zilong Wang
Jingbo Shang
49
0
0
23 May 2023
Multimodal Web Navigation with Instruction-Finetuned Foundation Models
Hiroki Furuta
Kuang-Huei Lee
Ofir Nachum
Yutaka Matsuo
Aleksandra Faust
S. Gu
Izzeddin Gur
LM&Ro
36
92
0
19 May 2023
PLM-GNN: A Webpage Classification Method based on Joint Pre-trained Language Model and Graph Neural Network
Qiwei Lang
Jing Zhou
Haoyi Wang
Shiqi Lyu
Rui Zhang
79
2
0
09 May 2023
An Inclusive Notion of Text
Ilia Kuznetsov
Iryna Gurevych
22
0
0
10 Nov 2022
FormLM: Recommending Creation Ideas for Online Forms by Modelling Semantic and Structural Information
Yijia Shao
Mengyu Zhou
Yifan Zhong
Tao Wu
Hongwei Han
Shi Han
Gideon Huang
Dongmei Zhang
3DV
17
2
0
10 Nov 2022
On Web-based Visual Corpus Construction for Visual Document Understanding
Donghyun Kim
Teakgyu Hong
Moonbin Yim
Yoonsik Kim
Geewook Kim
34
3
0
07 Nov 2022
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
Kenton Lee
Mandar Joshi
Iulia Turc
Hexiang Hu
Fangyu Liu
Julian Martin Eisenschlos
Urvashi Khandelwal
Peter Shaw
Ming-Wei Chang
Kristina Toutanova
CLIP
VLM
166
263
0
07 Oct 2022
XDoc: Unified Pre-training for Cross-Format Document Understanding
Jingye Chen
Tengchao Lv
Lei Cui
Changrong Zhang
Furu Wei
50
13
0
06 Oct 2022
GROWN+UP: A Graph Representation Of a Webpage Network Utilizing Pre-training
Benedict Yeoh
Huijuan Wang
GNN
31
1
0
03 Aug 2022
PLAtE: A Large-scale Dataset for List Page Web Extraction
Aidan San
Yuan Zhuang
J. Bakus
Colin Lockard
David M. Ciemiewicz
Sandeep Atluri
Yangfeng Ji
Kevin Small
Heba Elfardy
27
4
0
24 May 2022
TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages
Zihan Zhao
Lu Chen
Ruisheng Cao
Hongshen Xu
Xingyu Chen
Kai Yu
36
9
0
13 May 2022
Do BERTs Learn to Use Browser User Interface? Exploring Multi-Step Tasks with Unified Vision-and-Language BERTs
Taichi Iki
Akiko Aizawa
LLMAG
16
6
0
15 Mar 2022
DOM-LM: Learning Generalizable Representations for HTML Documents
Xiang Deng
Prashant Shiralkar
Colin Lockard
Binxuan Huang
Huan Sun
AI4TS
AI4CE
42
37
0
25 Jan 2022
Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks
Haoyu Dong
Zhoujun Cheng
Xinyi He
Mengyuan Zhou
Anda Zhou
Fan Zhou
Ao Liu
Shi Han
Dongmei Zhang
LMTD
65
64
0
24 Jan 2022
Document AI: Benchmarks, Models and Applications
Lei Cui
Yiheng Xu
Tengchao Lv
Furu Wei
VLM
21
69
0
16 Nov 2021
Simplified DOM Trees for Transferable Attribute Extraction from the Web
Yichao Zhou
Ying Sheng
N. Vo
Nick Edmonds
Sandeep Tata
124
28
0
07 Jan 2021
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
...
D. Florêncio
Cha Zhang
Wanxiang Che
Min Zhang
Lidong Zhou
ViT
MLLM
153
498
0
29 Dec 2020
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
134
355
0
27 May 2019