Chi-Wei Lee and Zen Chen
Institute of Computer Science and Information Engineering
National Chiao Tung University
Hsinchu, Taiwan, R.O.C.
This paper primarily addresses the on-line handwritten Chinese character recognition problem by using a stroke structural sequence code, although the code can also be applied to the off-line case, provided that the character strokes can be extracted in advance. The stroke structural sequence of a handwritten Chinese character varies from writer to writer, which makes the recognition problem difficult. After extensive analysis of handwritten character samples, we present a method to derive a unique stroke structural sequence code from handwritten Chinese character input. We seek consistent stroke features in the sample codes to define a preclassification code for each character. Based on the preclassification code, 5401 Chinese characters can be grouped into clusters, each of which contains a relatively small number of characters; in fact, many contain only a single character. Next, we examine the characters in each cluster to determine which ones have the correct preclassification code (called legitimate cluster characters), and we discard those that are included in the cluster only because of writing variations. For each legitimate character in a cluster, we search for individual consistent stroke features not yet used in the preclassification code to design the code for detailed matching. Here, the detailed matching code must cover all samples that come from the same character (the completeness condition) but no samples from other cluster characters (the consistency condition). Finally, the overall process of handwritten Chinese character recognition is given, which consists of three stages: preclassification, detailed matching, and stroke feature perturbation. Each input character is first classified based on its preclassification code. If it falls into a preclassification cluster, it then goes through detailed matching against the legitimate characters in the cluster. 
If a match is found, the recognition process terminates with an answer; if not, a proposed stroke feature perturbation technique is applied to the input character to obtain a perturbed stroke structural sequence code, and a new recognition cycle begins. The process ends when a match is found or when no more new perturbed codes are possible. Experimental results indicate that the proposed method can handle the recognition problem caused by ordinary handwriting variations.
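The three-stage control flow described above (preclassification, detailed matching, stroke feature perturbation) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the stroke labels, the toy `TEMPLATES`, the use of stroke count as the preclassification code, exact equality as the detailed matching test, the `CONFUSABLE` substitution table, and the choice to also perturb codes that were themselves produced by perturbation. Only the loop structure (classify, match, perturb, repeat until a match is found or no new perturbed code remains) follows the description.

```python
from collections import deque
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical representation: a stroke structural sequence code is a
# tuple of stroke labels ("h" horizontal, "v" vertical, "d" diagonal).
Code = Tuple[str, ...]

# Toy reference codes for three made-up characters (assumption).
TEMPLATES: Dict[str, Code] = {
    "A": ("h", "v"),
    "B": ("v", "h"),
    "C": ("h", "v", "h"),
}

# Toy perturbation table: a stroke that may be confused with another (assumption).
CONFUSABLE: Dict[str, List[str]] = {"d": ["h"]}


def preclass_code(code: Code) -> int:
    """Stand-in preclassification code: here simply the stroke count."""
    return len(code)


# Group the legitimate characters into clusters by preclassification code.
CLUSTERS: Dict[int, List[str]] = {}
for name, template in TEMPLATES.items():
    CLUSTERS.setdefault(preclass_code(template), []).append(name)


def detailed_match(code: Code, char: str) -> bool:
    """Stand-in detailed matching: exact equality with the toy template."""
    return code == TEMPLATES[char]


def perturb(code: Code) -> Iterator[Code]:
    """Yield codes that differ from `code` by one confusable stroke."""
    for i, stroke in enumerate(code):
        for alt in CONFUSABLE.get(stroke, []):
            yield code[:i] + (alt,) + code[i + 1:]


def recognize(code: Code) -> Optional[str]:
    """Run recognition cycles until a match is found or no new code remains."""
    seen = {code}
    pending = deque([code])
    while pending:
        current = pending.popleft()
        # Stage 1: preclassification selects a cluster (possibly empty).
        for char in CLUSTERS.get(preclass_code(current), []):
            # Stage 2: detailed matching against legitimate cluster characters.
            if detailed_match(current, char):
                return char
        # Stage 3: no match, so queue perturbed codes not tried before.
        for alt in perturb(current):
            if alt not in seen:
                seen.add(alt)
                pending.append(alt)
    return None
```

With this toy data, `recognize(("h", "v"))` matches "A" in the first cycle, `recognize(("d", "v"))` matches "A" only after one perturbation, and `recognize(("x", "x"))` returns `None` because no perturbed codes can be generated.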
Keywords: handwritten Chinese character recognition, handwriting variations, stroke structural sequence, character preclassification, detailed matching, stroke feature perturbation
Received April 9, 1996; revised October 30, 1996.
Communicated by Wen-Lien Hsu.