Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.16356
Cited By
A Multi-Modal Multilingual Benchmark for Document Image Classification
25 October 2023
Yoshinari Fujinuma
Siddharth Varia
Nishant Sankaran
Srikar Appalaraju
Bonan Min
Yogarshi Vyas
VLM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"A Multi-Modal Multilingual Benchmark for Document Image Classification"
19 / 19 papers shown
Title
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
Ofir Abramovich
Niv Nayman
Sharon Fogel
I. Lavi
Ron Litman
Shahar Tsiper
Royee Tichauer
Srikar Appalaraju
Shai Mazor
R. Manmatha
VLM
81
3
0
17 Jul 2024
DocFormerv2: Local Features for Document Understanding
Srikar Appalaraju
Peng Tang
Qi Dong
Nishant Sankaran
Yichu Zhou
R. Manmatha
87
40
0
02 Jun 2023
An Exploration of Encoder-Decoder Approaches to Multi-Label Classification for Legal and Biomedical Text
Yova Kementchedjhieva
Ilias Chalkidis
87
24
0
09 May 2023
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Yupan Huang
Tengchao Lv
Lei Cui
Yutong Lu
Furu Wei
95
458
0
18 Apr 2022
DiT: Self-supervised Pre-training for Document Image Transformer
Junlong Li
Yiheng Xu
Tengchao Lv
Lei Cui
Chaoxi Zhang
Furu Wei
ViT
VLM
100
166
0
04 Mar 2022
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding
Jiapeng Wang
Lianwen Jin
Kai Ding
VLM
69
142
0
28 Feb 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA
Ali Furkan Biten
Ron Litman
Yusheng Xie
Srikar Appalaraju
R. Manmatha
ViT
95
102
0
23 Dec 2021
Document AI: Benchmarks, Models and Applications
Lei Cui
Yiheng Xu
Tengchao Lv
Furu Wei
VLM
66
73
0
16 Nov 2021
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
Ilias Chalkidis
Abhik Jana
D. Hartung
M. Bommarito
Ion Androutsopoulos
Daniel Martin Katz
Nikolaos Aletras
AILaw
ELM
226
266
0
03 Oct 2021
DocFormer: End-to-End Transformer for Document Understanding
Srikar Appalaraju
Bhavan A. Jasani
Bhargava Urala Kota
Yusheng Xie
R. Manmatha
ViT
88
279
0
22 Jun 2021
StructuralLM: Structural Pre-training for Form Understanding
Chenliang Li
Bin Bi
Ming Yan
Wei Wang
Songfang Huang
Fei Huang
Luo Si
LMTD
AI4CE
88
134
0
24 May 2021
LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding
Yiheng Xu
Tengchao Lv
Lei Cui
Guoxin Wang
Yijuan Lu
D. Florêncio
Cha Zhang
Furu Wei
MLLM
VLM
78
130
0
18 Apr 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
463
21,566
0
25 Mar 2021
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
Dan Hendrycks
Collin Burns
Anya Chen
Spencer Ball
ELM
AILaw
53
195
0
10 Mar 2021
DocVQA: A Dataset for VQA on Document Images
Minesh Mathew
Dimosthenis Karatzas
C. V. Jawahar
144
743
0
01 Jul 2020
LayoutLM: Pre-training of Text and Layout for Document Image Understanding
Yiheng Xu
Minghao Li
Lei Cui
Shaohan Huang
Furu Wei
Ming Zhou
135
707
0
31 Dec 2019
Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
228
6,587
0
05 Nov 2019
Choosing Transfer Languages for Cross-Lingual Learning
Yu-Hsiang Lin
Chian-Yu Chen
Jean Lee
Zirui Li
Yuyan Zhang
...
Zhisong Zhang
Xuezhe Ma
Antonios Anastasopoulos
Patrick Littell
Graham Neubig
91
233
0
29 May 2019
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
168
370
0
27 May 2019
1