2111.15263
Cited By
Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features
30 November 2021
Byeonghu Na
Yoonsik Kim
Sungrae Park
Papers citing
"Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features"
11 / 11 papers shown
Title
Instruction-Guided Scene Text Recognition
Yongkun Du
Z. Chen
Yuchen Su
Caiyan Jia
Yu-Gang Jiang
75
3
0
03 Jan 2025
Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition
Bangbang Zhou
Yadong Qu
Zixiao Wang
Zicheng Li
Boqiang Zhang
Hongtao Xie
47
1
0
08 Jul 2024
Efficient scene text image super-resolution with semantic guidance
LeoWu TomyEnrique
Xiangcheng Du
Kangliang Liu
Han Yuan
Zhao Zhou
Cheng Jin
VLM
31
2
0
20 Mar 2024
TextBlockV2: Towards Precise-Detection-Free Scene Text Spotting with Pre-trained Language Model
Jiahao Lyu
Jin Wei
Gangyan Zeng
Zeng Li
Enze Xie
Wei Wang
Yu Zhou
VLM
29
3
0
15 Mar 2024
An Empirical Study of Scaling Law for OCR
Miao Rang
Zhenni Bi
Chuanjian Liu
Yunhe Wang
Kai Han
38
6
0
29 Dec 2023
IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition
Xiaomeng Yang
Zhi Qiao
Yu Zhou
DiffM
62
1
0
19 Dec 2023
Scene Text Recognition Models Explainability Using Local Features
M. Ty
Rowel Atienza
36
1
0
14 Oct 2023
Towards Robust Scene Text Image Super-resolution via Explicit Location Enhancement
Han Guo
Tao Dai
G. Meng
Shutao Xia
26
11
0
19 Jul 2023
CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model
Shuai Zhao
Xiaohan Wang
Linchao Zhu
Yezhou Yang
CLIP
VLM
23
25
0
23 May 2023
OCR-IDL: OCR Annotations for Industry Document Library Dataset
Ali Furkan Biten
Rubèn Pérez Tito
Lluís Gómez
Ernest Valveny
Dimosthenis Karatzas
25
26
0
25 Feb 2022
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
424
596
0
21 Jul 2020