Jia-lin Shen and Wen L. Hwang
Although the delta and RASTA methods have been widely used to extract the temporal properties of stationary features for robust speech recognition, there is still a need to investigate new temporal features for better performance. In this paper, we present two new temporal features for robust processing of speech signals, with emphasis on microphone variations. The first is derived from a bank of RASTA-like filters, in which the parameters of each filter are estimated according to the statistical properties of the speech signals. The second is a parameterized temporal filter (PTF). The filter is described by four parameters: the passband, the beginning transition, the ending transition, and the smoothness of the magnitude of the filter response. Together, these parameters determine the magnitude of the frequency response of the PTF, and a transformation algorithm is then used to derive temporal coefficients that are real and causal. The discriminative ability of the PTF features can be further enhanced using the minimum classification error (MCE) algorithm. Experimental results show that the RASTA features are inferior to the PTF features both in quiet conditions and in the presence of microphone variations.
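For orientation, the two conventional temporal features named above, delta and RASTA, can be sketched as follows. This is a minimal illustration of the baselines only, not of the proposed filter bank or the PTF. The function names, the regression width of 2 frames, and the pole value 0.98 (the commonly cited RASTA setting) are assumptions made for this sketch and are not taken from this paper.

```python
# Conventional RASTA band-pass filter in causal form (assumed textbook setting):
#   H(z) = 0.1 * (2 + z^-1 - z^-3 - 2 z^-4) / (1 - 0.98 z^-1)
RASTA_NUMERATOR = [0.2, 0.1, 0.0, -0.1, -0.2]
RASTA_POLE = 0.98


def rasta_filter(trajectory, pole=RASTA_POLE):
    """Filter one feature trajectory (one coefficient tracked over frames)."""
    filtered = []
    previous = 0.0
    for n in range(len(trajectory)):
        # FIR part: a 5-tap slope estimator; its taps sum to zero, so any
        # constant offset in the trajectory is cancelled.
        fir = sum(
            tap * trajectory[n - k]
            for k, tap in enumerate(RASTA_NUMERATOR)
            if n - k >= 0
        )
        # IIR part: a single-pole leaky integrator.
        previous = fir + pole * previous
        filtered.append(previous)
    return filtered


def delta(trajectory, width=2):
    """Regression-based delta features; edge frames are repeated as padding."""
    last = len(trajectory) - 1
    norm = 2.0 * sum(k * k for k in range(1, width + 1))
    deltas = []
    for t in range(len(trajectory)):
        acc = 0.0
        for k in range(1, width + 1):
            ahead = trajectory[min(t + k, last)]
            behind = trajectory[max(t - k, 0)]
            acc += k * (ahead - behind)
        deltas.append(acc / norm)
    return deltas
```

Both functions operate on a single trajectory, so in practice they would be applied to each cepstral or log-spectral coefficient separately. Because the numerator taps of the RASTA filter sum to zero, a constant offset added to a trajectory, which is how a fixed channel or microphone characteristic typically appears in the log-spectral or cepstral domain, decays to nearly zero in the output after the initial transient.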