Kuo-Chin Fan and Chien-Hsiang Huang
Institute of Computer Science and Information Engineering
National Central University
Chungli, 320 Taiwan
E-mail: kcfan@csie.ncu.edu.tw
In this paper, a novel italic detection and rectification method without the prerequisite
of character recognition is proposed. An italic style character can be obtained by
performing a shear transformation on its corresponding non-italic style character. Traditional
italic detection methods must operate on at least a word, a sentence, or even
a whole paragraph. The merit of the proposed method is that it operates directly
on a single character, so that more accurate statistical information can be obtained. The
rationale of our proposed method is that the difference of certain features derived from
italic style characters after shear transformation will be canceled, whereas the difference
will be more obvious for non-italic style (normal style) characters. In our proposed approach,
the virtual strokes embedded in the considered character image are extracted first.
Then, a reverse shear transformation is applied to the considered character image. The 26 uppercase
and 26 lowercase letters are classified into three classes based on the structural information
of the extracted virtual strokes. The italic and non-italic style characters can then
be distinguished based on the classification rule devised for each class of characters. Finally,
the exact shear angle of the identified italic character is calculated so that a more accurate
reverse shear transformation can rectify the italic style character into a normal
(non-italic) style character to facilitate the later OCR task. Experiments were conducted
on 50 document images with mixed italic and normal style characters. Satisfactory accuracy
rates of 99.59% for italic style characters and 99.85% for normal style characters are
achieved. Experimental results verify the validity of our proposed method in distinguishing
italic and non-italic style characters.
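
The shear relationship between italic and normal style characters stated above can be illustrated with a minimal sketch. It is not the implementation used in this paper: the function names `shear_points` and `unshear_points`, the 15-degree slant, and the convention that y is measured upward from the baseline are all assumptions made for illustration. The sketch only shows that a horizontal shear slants a vertical stroke and that shearing by the negated angle restores it.

```python
import math

def shear_points(points, angle_deg):
    """Apply the horizontal shear x' = x + y * tan(angle) to (x, y) points.

    Assumption: y is measured upward from the baseline, so a positive
    angle slants strokes to the right, as in italic type.
    """
    k = math.tan(math.radians(angle_deg))
    return [(x + k * y, y) for x, y in points]

def unshear_points(points, angle_deg):
    """Reverse shear: shearing by the negated angle undoes the slant."""
    return shear_points(points, -angle_deg)

# A vertical stroke rising from the baseline (y = 0) to height 10.
stroke = [(5.0, float(y)) for y in range(11)]

# Hypothetical 15-degree slant, chosen only for illustration.
italic = shear_points(stroke, 15.0)
restored = unshear_points(italic, 15.0)
```

In this sketch the point on the baseline does not move, while points higher on the stroke are displaced further to the right. Rectifying a real character image would additionally require resampling pixels and an estimate of the shear angle, neither of which is covered here.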
Received January 10, 2005; revised March 30, 2005; accepted May 2, 2005.
Communicated by Pau-Choo Chung.
* This work was supported in part by the MOE Program for Promoting Academic Excellence of Universities under
grant No. 91-H-FA08-1-4.