REZCR: Zero-shot Character Recognition via Radical Extraction

12 July 2022

Xiaolei Diao

Abstract

The long-tail effect is a common issue that limits the performance of deep learning models on real-world datasets. Character image datasets suffer from the same unbalanced data distribution because characters differ in usage frequency. As a result, current character recognition methods are limited in real-world use, especially for tail categories that lack training samples, e.g., uncommon characters. In this paper, we propose a zero-shot character recognition framework via radical extraction (REZCR) to improve the recognition performance of few-sample character categories in the tail. Specifically, we exploit radicals, the graphical units of characters, by decomposing and reconstructing characters according to orthography. REZCR consists of an attention-based radical information extractor (RIE) and a knowledge graph-based character reasoner (KGR). The RIE recognizes candidate radicals and their possible structural relations from character images. These results are then fed into the KGR, which recognizes the target character by reasoning over a pre-designed knowledge graph. We validate our method on multiple datasets, and REZCR shows promising experimental results, especially on few-sample character datasets.
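The two-stage idea in the abstract (candidate radicals and structures first, graph-based reasoning second) can be illustrated with a small sketch. This is not the authors' implementation: the toy graph `TOY_GRAPH`, the function `reason_character`, the structure labels, and the product-of-scores ranking are all assumptions made for illustration, and the candidate scores stand in for whatever an extractor such as the RIE might output.

```python
from itertools import product

# Hypothetical toy knowledge graph: (structure, ordered radicals) -> character.
# A real graph would cover a full orthography; these five entries are illustrative.
TOY_GRAPH = {
    ("left-right", ("木", "木")): "林",
    ("left-right", ("日", "月")): "明",
    ("left-right", ("女", "子")): "好",
    ("top-bottom", ("木", "口")): "杏",
    ("top-bottom", ("口", "木")): "呆",
}


def reason_character(radical_candidates, structure_candidates, graph=TOY_GRAPH):
    """Pick the character whose decomposition best matches the candidates.

    radical_candidates: list with one dict per radical position,
        mapping radical -> score (assumed to come from an extractor).
    structure_candidates: dict mapping structure label -> score.
    Returns (character, joint_score), or None if no combination is in the graph.
    """
    best = None
    for structure, s_score in structure_candidates.items():
        # Enumerate every way of choosing one radical per position.
        for combo in product(*(c.items() for c in radical_candidates)):
            radicals = tuple(r for r, _ in combo)
            char = graph.get((structure, radicals))
            if char is None:
                continue  # this combination is not a known character
            # Assumed ranking rule: product of structure and radical scores.
            score = s_score
            for _, r_score in combo:
                score *= r_score
            if best is None or score > best[1]:
                best = (char, score)
    return best


if __name__ == "__main__":
    radicals = [{"木": 0.9, "日": 0.1}, {"口": 0.6, "月": 0.4}]
    structures = {"top-bottom": 0.7, "left-right": 0.3}
    print(reason_character(radicals, structures))
```

In this sketch a character needs no training images of its own: it can be returned as long as its radicals and structure are recognized and its decomposition is present in the graph, which is the zero-shot property the abstract describes.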
