2206.00311
MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining
1 June 2022
Pengyuan Lyu
Chengquan Zhang
Shanshan Liu
Meina Qiao
Yangliu Xu
Liang Wu
Kun Yao
Junyu Han
Errui Ding
Jingdong Wang
Papers citing
"MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining"
21 / 21 papers shown
Title
Joint Low-level and High-level Textual Representation Learning with Multiple Masking Strategies
Zhengmi Tang
Yuto Mitsui
Tomo Miyazaki
S. Omachi
34
0
0
11 May 2025
DocTTT: Test-Time Training for Handwritten Document Recognition Using Meta-Auxiliary Learning
Wenhao Gu
Li Gu
Ziqiang Wang
Ching Yee Suen
Yang Wang
53
0
0
22 Jan 2025
General Detection-based Text Line Recognition
Raphael Baena
Syrine Kalleli
Mathieu Aubry
151
0
0
25 Sep 2024
Spatial Context-based Self-Supervised Learning for Handwritten Text Recognition
Carlos Peñarrubia
Carlos Garrido-Munoz
J. J. Valero-Mas
Jorge Calvo-Zaragoza
37
1
0
17 Apr 2024
An Empirical Study of Scaling Law for OCR
Miao Rang
Zhenni Bi
Chuanjian Liu
Yunhe Wang
Kai Han
38
6
0
29 Dec 2023
IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition
Xiaomeng Yang
Zhi Qiao
Yu Zhou
DiffM
62
1
0
19 Dec 2023
DTrOCR: Decoder-only Transformer for Optical Character Recognition
Masato Fujitake
46
35
0
30 Aug 2023
Combining OCR Models for Reading Early Modern Printed Books
Mathias Seuret
Janne van der Loop
Nikolaus Weichselbaumer
Martin Mayr
J. Molnar
Tatjana Hass
Florian Kordon
Anguelos Nicolau
Vincent Christlein
26
2
0
11 May 2023
Aerial Image Object Detection With Vision Transformer Detector (ViTDet)
Liya Wang
A. Tien
44
7
0
28 Jan 2023
Transferring General Multimodal Pretrained Models to Text Recognition
Junyang Lin
Xuancheng Ren
Yichang Zhang
Gao Liu
Peng Wang
An Yang
Chang Zhou
34
4
0
19 Dec 2022
Multi-Granularity Prediction for Scene Text Recognition
Peng Wang
Cheng Da
Cong Yao
66
48
0
08 Sep 2022
Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition
Mingkun Yang
Minghui Liao
Pu Lu
Jing Wang
Shenggao Zhu
Hualin Luo
Qingzhen Tian
X. Bai
SSL
33
55
0
01 Jul 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
41
528
0
27 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,443
0
11 Nov 2021
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
Minghao Li
Tengchao Lv
Jingye Chen
Lei Cui
Yijuan Lu
D. Florêncio
Cha Zhang
Zhoujun Li
Furu Wei
ViT
98
343
0
21 Sep 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
304
3,708
0
11 Feb 2021
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
267
3,371
0
09 Mar 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
Andreas Veit
Tomas Matera
Lukáš Neumann
Jiří Matas
Serge J. Belongie
188
515
0
26 Jan 2016