Journal of Information Science and Engineering, Vol. 16, No. 4, pp. 649-660 (July 2000)

Improvement in Connected Mandarin Digit Recognition
by Explicitly Modeling Coarticulatory Information

Ruey-Ching Shyu, Jhing-Fa Wang and Jau-Yien Lee*
Department of Electrical Engineering
National Cheng Kung University
Tainan, Taiwan 701, R.O.C.
*Department of Electrical Engineering
Chang Gung University
Taoyuan County, Taiwan 333, R.O.C.

The most successful training scheme for connected spoken digit recognition is the segmental k-means algorithm, which iteratively builds reliable reference patterns and thereby captures the coarticulatory information of connected speech implicitly. When this algorithm is applied to Mandarin digits, however, its performance is inferior to that obtained for English. We therefore propose a novel approach to building reliable reference patterns for connected Mandarin digits. Each training digit is partitioned into three sections, which represent, respectively, the coarticulation with the preceding digit, the characteristics of the digit itself, and the coarticulation with the succeeding digit. In this manner, the coarticulatory information is captured explicitly. The three sections are then modeled separately using three Bayesian templates, which are combined into a multi-section Bayesian template for each reference Mandarin digit. Experimental results show that the new method outperforms the segmental k-means algorithm by 3.2% on a multi-speaker speech database.
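
The three-section idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the abstract does not say how section boundaries are chosen, so the sketch uses equal thirds, and it treats each "Bayesian template" as a per-dimension Gaussian (mean and variance) fitted to the frames of one section. The function names and the variance floor are hypothetical.

```python
from statistics import fmean, pvariance


def split_into_three(frames):
    """Split one digit's frame sequence into (leading, core, trailing) sections.

    Assumption: equal thirds. The abstract does not state how the
    section boundaries are actually determined.
    """
    n = len(frames)
    if n < 3:
        raise ValueError("need at least three frames")
    a, b = n // 3, (2 * n) // 3
    return frames[:a], frames[a:b], frames[b:]


def gaussian_template(section_frames, var_floor=1e-3):
    """Fit a per-dimension mean and variance to the frames pooled from one section.

    Assumption: a diagonal Gaussian stands in for a Bayesian template.
    The variance floor (hypothetical) avoids zero variances.
    """
    dims = list(zip(*section_frames))  # transpose: one tuple per feature dimension
    means = [fmean(d) for d in dims]
    variances = [max(pvariance(d), var_floor) for d in dims]
    return means, variances


def train_multi_section_template(tokens):
    """Build three section templates from all training tokens of one digit.

    `tokens` is a list of utterances; each utterance is a list of
    feature vectors (lists of floats).
    """
    pooled = ([], [], [])
    for frames in tokens:
        for bucket, section in zip(pooled, split_into_three(frames)):
            bucket.extend(section)
    return [gaussian_template(bucket) for bucket in pooled]
```

Given two toy one-dimensional tokens of six frames each, `train_multi_section_template` returns three `(means, variances)` pairs, one each for the leading, core, and trailing sections. In a full recognizer these would be scored against the input frames, for example within the level-building algorithm listed in the keywords, which this sketch does not cover.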

Keywords: connected Mandarin digit recognition, segmental k-means algorithm, coarticulatory information, multi-section Bayesian template, level-building algorithm

Received November 30, 1998; revised April 19, 1999; accepted June 25, 1999.
Communicated by Zen Chen.