Chinese Character Recognition with Radical-Structured Stroke Trees

Paper Title

Chinese Character Recognition with Radical-Structured Stroke Trees

Authors

Yu, Haiyang; Chen, Jingye; Li, Bin; Xue, Xiangyang

Abstract

Deep learning has driven rapid progress in Chinese character recognition. However, it remains a great challenge that test characters may follow a different distribution from the training dataset. Existing methods based on a single-level representation (character-level, radical-level, or stroke-level) may be either too sensitive to distribution changes (e.g., those induced by blurring, occlusion, and zero-shot problems) or too tolerant of one-to-many ambiguities. In this paper, we represent each Chinese character as a stroke tree, organized according to its radical structures, to fully exploit the merits of both the radical and stroke levels. We propose a two-stage decomposition framework, in which a Feature-to-Radical Decoder perceives radical structures and radical regions, and a Radical-to-Stroke Decoder then predicts the stroke sequences from the features of the radical regions. The generated radical structures and stroke sequences are encoded as a Radical-Structured Stroke Tree (RSST), which is fed to a Tree-to-Character Translator that uses the proposed Weighted Edit Distance to match the closest candidate character in the RSST lexicon. Extensive experimental results demonstrate that the proposed method outperforms state-of-the-art single-level methods by increasing margins as the distribution difference becomes more severe in the blurring, occlusion, and zero-shot scenarios, which validates the robustness of the proposed method.
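The final matching step, in which a predicted tree is compared against a lexicon and the closest candidate is chosen, can be illustrated with a small sketch. The abstract does not define the Weighted Edit Distance, so everything specific here is an assumption made for illustration: the tree is flattened into a token sequence, structure tokens are given a higher cost than stroke tokens, and the token names (`LR`, `TB`, `h`, `v`, `p`, `n`, `z`) and the toy lexicon are hypothetical. It is not the authors' formulation.

```python
from typing import Dict, List, Tuple

# Hypothetical weights: an error on a structure token costs more than
# an error on a single stroke token. These values are assumptions.
STRUCTURE_TOKENS = {"LR", "TB"}  # e.g. left-right, top-bottom layouts


def token_weight(token: str) -> float:
    """Return the assumed edit cost for one token."""
    return 2.0 if token in STRUCTURE_TOKENS else 1.0


def weighted_edit_distance(pred: List[str], cand: List[str]) -> float:
    """Edit distance in which each token carries its own cost."""
    m, n = len(pred), len(cand)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):  # delete every predicted token
        dp[i][0] = dp[i - 1][0] + token_weight(pred[i - 1])
    for j in range(1, n + 1):  # insert every candidate token
        dp[0][j] = dp[0][j - 1] + token_weight(cand[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            a, b = pred[i - 1], cand[j - 1]
            # Substitution is free on a match, otherwise the larger weight.
            sub = 0.0 if a == b else max(token_weight(a), token_weight(b))
            dp[i][j] = min(
                dp[i - 1][j] + token_weight(a),  # delete a
                dp[i][j - 1] + token_weight(b),  # insert b
                dp[i - 1][j - 1] + sub,          # match or substitute
            )
    return dp[m][n]


def match_character(
    pred: List[str], lexicon: Dict[str, List[str]]
) -> Tuple[str, float]:
    """Return the lexicon entry closest to the predicted sequence."""
    best = min(lexicon, key=lambda ch: weighted_edit_distance(pred, lexicon[ch]))
    return best, weighted_edit_distance(pred, lexicon[best])


# Toy lexicon with placeholder entries (not real character decompositions).
LEXICON = {
    "char_A": ["LR", "h", "v", "p", "n"],
    "char_B": ["TB", "h", "v", "p", "n"],
    "char_C": ["LR", "h", "z", "z"],
}

# A prediction that misses the last stroke, as might happen under occlusion.
prediction = ["LR", "h", "v", "p"]
result = match_character(prediction, LEXICON)
print(result)
```

With these assumed weights, the incomplete prediction is one stroke insertion away from `char_A`, while `char_B` additionally needs a costlier structure substitution. This is the kind of tolerance to partial errors that matching against a lexicon is meant to provide.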
