An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

21 July 2015
Baoguang Shi
X. Bai
Cong Yao
    VLM

Papers citing "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition"

50 / 645 papers shown
DOTA: Deformable Optimized Transformer Architecture for End-to-End Text Recognition with Retrieval-Augmented Generation
Naphat Nithisopa
Teerapong Panboonyuen
ViT
26
0
0
07 May 2025
Visual Text Processing: A Comprehensive Review and Unified Evaluation
Yan Shu
Weichao Zeng
Fangmin Zhao
Zeyu Chen
Z. Li
...
Paolo Rota
Xiang Bai
Lianwen Jin
Xu-Cheng Yin
N. Sebe
CoGe
59
0
0
30 Apr 2025
Single Document Image Highlight Removal via A Large-Scale Real-World Dataset and A Location-Aware Network
Lu Pan
Yu-Hsuan Huang
Hongxia Xie
Cheng Zhang
H Zhao
Hong-Han Shuai
Wen-Huang Cheng
23
0
0
19 Apr 2025
ViMo: A Generative Visual GUI World Model for App Agent
Dezhao Luo
Bohan Tang
Kang Li
Georgios Papoudakis
Jifei Song
S. Gong
Jianye Hao
Jun Wang
Kun Shao
LM&Ro
VGen
51
0
0
15 Apr 2025
UniRVQA: A Unified Framework for Retrieval-Augmented Vision Question Answering via Self-Reflective Joint Training
Jiaqi Deng
Kaize Shi
Zonghan Wu
Huan Huo
Dingxian Wang
Guandong Xu
21
0
0
05 Apr 2025
Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation
Xingguang Zhang
Nicholas Chimitt
Xijun Wang
Yu Yuan
Stanley H. Chan
36
0
0
03 Apr 2025
NCAP: Scene Text Image Super-Resolution with Non-CAtegorical Prior
Dongwoo Park
Suk Pil Ko
153
0
0
01 Apr 2025
Leveraging Contrast Information for Efficient Document Shadow Removal
Yong-Jin Liu
Jiancheng Huang
Na Liu
Mingfu Yan
Yi Huang
Shifeng Chen
38
0
0
01 Apr 2025
Improving Applicability of Deep Learning based Token Classification models during Training
Anket Mehra
Malte Prieß
Marian Himstedt
46
0
0
28 Mar 2025
Practical Fine-Tuning of Autoregressive Models on Limited Handwritten Texts
Jan Kohút
Michal Hradiš
78
0
0
25 Mar 2025
Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition
Yifei Zhang
Chang-Shu Liu
Jin Wei
Xiaomeng Yang
Yu Zhou
Can Ma
Xiangyang Ji
68
2
0
24 Mar 2025
Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation
Andrea Maracani
Savas Ozkan
Sijun Cho
Hyowon Kim
Eunchung Noh
Jeongwon Min
Cho Jung Min
Dookun Park
Mete Ozay
38
0
0
20 Mar 2025
A Context-Driven Training-Free Network for Lightweight Scene Text Segmentation and Recognition
Ritabrata Chakraborty
Shivakumara Palaiahnakote
Umapada Pal
Cheng-Lin Liu
VLM
47
0
0
19 Mar 2025
Historic Scripts to Modern Vision: A Novel Dataset and A VLM Framework for Transliteration of Modi Script to Devanagari
Harshal Kausadikar
Tanvi Kale
Onkar Susladkar
Sparsh Mittal
60
0
0
17 Mar 2025
MathMistake Checker: A Comprehensive Demonstration for Step-by-Step Math Problem Mistake Finding by Prompt-Guided LLMs
T. Zhang
Zhuoxuan Jiang
Haotian Zhang
Lin Lin
Shaohua Zhang
LRM
65
0
0
06 Mar 2025
TextDoctor: Unified Document Image Inpainting via Patch Pyramid Diffusion Models
Wanglong Lu
Lingming Su
Jingjing Zheng
Vinícius Veloso de Melo
Farzaneh Shoeleh
J. Hawkin
T. Tricco
Hanli Zhao
Xianta Jiang
DiffM
58
0
0
06 Mar 2025
DashCop: Automated E-ticket Generation for Two-Wheeler Traffic Violations Using Dashcam Videos
Deepti Rawat
Keshav Gupta
Aryamaan Basu Roy
Ravi Kiran Sarvadevabhatla
36
0
0
01 Mar 2025
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models
Wenwen Yu
Zhibo Yang
Jianqiang Wan
Sibo Song
J. Tang
Wenqing Cheng
Y. Liu
Xiang Bai
51
1
0
22 Feb 2025
Handwritten Text Recognition: A Survey
Carlos Garrido-Munoz
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
106
0
0
12 Feb 2025
PLATTER: A Page-Level Handwritten Text Recognition System for Indic Scripts
Badri Vishal Kasuba
Dhruv Kudale
Venkatapathy Subramanian
P. Chaudhuri
Ganesh Ramakrishnan
48
0
0
10 Feb 2025
SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild
Jiawei Liu
Yuanzhi Zhu
Feiyu Gao
Z. Yang
P. Wang
Junyang Lin
Xinyu Wang
Wenyu Liu
DiffM
45
0
0
08 Jan 2025
First-place Solution for Streetscape Shop Sign Recognition Competition
Bin Wang
Li Jing
145
0
0
06 Jan 2025
Efficient Video-Based ALPR System Using YOLO and Visual Rhythm
Victor Nascimento Ribeiro
Nina S. T. Hirata
30
0
0
04 Jan 2025
Instruction-Guided Scene Text Recognition
Yongkun Du
Z. Chen
Yuchen Su
Caiyan Jia
Yu-Gang Jiang
75
3
0
03 Jan 2025
Disentanglement and Compositionality of Letter Identity and Letter Position in Variational Auto-Encoder Vision Models
Bruno Bianchi
Aakash Agrawal
S. Dehaene
Emmanuel Chemla
Yair Lakretz
DRL
CoGe
70
0
0
11 Dec 2024
TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition
Xingsong Ye
Yongkun Du
Yunbo Tao
Z. Chen
DiffM
110
0
0
02 Dec 2024
DLaVA: Document Language and Vision Assistant for Answer Localization with Enhanced Interpretability and Trustworthiness
Ahmad Mohammadshirazi
Pinaki Prasad Guha Neogi
Ser-Nam Lim
R. Ramnath
70
1
0
29 Nov 2024
SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition
Yongkun Du
Z. Chen
Hongtao Xie
Caiyan Jia
Yu-Gang Jiang
85
1
0
24 Nov 2024
Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing
Yadong Qu
Yuxin Wang
Bangbang Zhou
Z. Wang
Hongtao Xie
Yongdong Zhang
85
0
0
23 Nov 2024
Learning based Geéz character handwritten recognition
Hailemicael Lulseged Yimer
Hailegabriel Dereje Degefa
Marco Cristani
Federico Cunico
64
0
0
20 Nov 2024
Relational Contrastive Learning and Masked Image Modeling for Scene Text Recognition
T. Lin
Jinglei Zhang
Yi Xu
Kai Chen
Rui Zhang
Cheng Chen
38
0
0
18 Nov 2024
SAN: Structure-Aware Network for Complex and Long-tailed Chinese Text Recognition
Jingyang Zhang
Chang-rui Liu
Chun Yang
24
2
0
10 Nov 2024
HIP: Hierarchical Point Modeling and Pre-training for Visual Information Extraction
Rujiao Long
Pengfei Wang
Zhibo Yang
Cong Yao
41
0
0
02 Nov 2024
Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant
A. S. Penamakuri
Anand Mishra
26
1
0
24 Oct 2024
Integrating Canonical Neural Units and Multi-Scale Training for Handwritten Text Recognition
Zi-Rui Wang
24
0
0
24 Oct 2024
Human-Inspired Long-Term Indoor Localization in Human-Oriented Environment
Nicky Zimmerman
Matteo Sodano
26
0
0
16 Oct 2024
ChartKG: A Knowledge-Graph-Based Representation for Chart Images
Zhiguang Zhou
Haoxuan Wang
Zhengqing Zhao
Fengling Zheng
Yongheng Wang
Wei Chen
Yong Wang
32
0
0
13 Oct 2024
Grounding Partially-Defined Events in Multimodal Data
Kate Sanders
Reno Kriz
David Etter
Hannah Recknor
Alexander Martin
Cameron Carpenter
Jingyang Lin
Benjamin Van Durme
27
2
0
07 Oct 2024
HATFormer: Historic Handwritten Arabic Text Recognition with Transformers
Adrian Chan
Anupam Mijar
Mehreen Saeed
Chau-Wai Wong
Akram Khater
41
0
0
03 Oct 2024
AI-Powered Augmented Reality for Satellite Assembly, Integration and Test
Alvaro Patricio
Joao Valente
Atabak Dehban
Ines Cadilha
Daniel Reis
Rodrigo Ventura
27
1
0
26 Sep 2024
Text Image Generation for Low-Resource Languages with Dual Translation Learning
Chihiro Noguchi
Shun Fukuda
Shoichiro Mihara
Masao Yamanaka
DiffM
28
0
0
26 Sep 2024
General Detection-based Text Line Recognition
Raphael Baena
Syrine Kalleli
Mathieu Aubry
151
0
0
25 Sep 2024
One Model for Two Tasks: Cooperatively Recognizing and Recovering Low-Resolution Scene Text Images by Iterative Mutual Guidance
Minyi Zhao
Yang Wang
Jihong Guan
Shuigeng Zhou
30
0
0
22 Sep 2024
VL-Reader: Vision and Language Reconstructor is an Effective Scene Text Recognizer
Humen Zhong
Zhibo Yang
Zhaohai Li
Peng Wang
Jun Tang
Wenqing Cheng
Cong Yao
23
1
0
18 Sep 2024
HTR-VT: Handwritten Text Recognition with Vision Transformer
Yuting Li
Dexiong Chen
Tinglong Tang
Xi Shen
ViT
21
7
0
13 Sep 2024
Boosting CNN-based Handwriting Recognition Systems with Learnable Relaxation Labeling
S. Ferro
Alessandro Torcinovich
Arianna Traviglia
Marcello Pelillo
31
0
0
09 Sep 2024
PdfTable: A Unified Toolkit for Deep Learning-Based Table Extraction
Lei Sheng
Shuai-Shuai Xu
LMTD
34
0
0
08 Sep 2024
RoomDiffusion: A Specialized Diffusion Model in the Interior Design Industry
Zhaowei Wang
Ying Hao
Hao Wei
Qing Xiao
Lulu Chen
Yulong Li
Yue Yang
Tianyi Li
DiffM
30
0
0
05 Sep 2024
Platypus: A Generalized Specialist Model for Reading Text in Various Forms
Peng Wang
Zhaohai Li
Jun Tang
Humen Zhong
Fei Huang
Zhibo Yang
Cong Yao
VLM
ObjD
40
2
0
27 Aug 2024
Decoder Pre-Training with only Text for Scene Text Recognition
Shuai Zhao
Yongkun Du
Zhineng Chen
Yu-Gang Jiang
33
0
0
11 Aug 2024