
Wang, Hsin-min (王新民)

Basic Information

Position: Associate Research Fellow (2002/12--)
Highest degree: Ph.D., EE, National Taiwan University (1995)
Tel: +886-2-2788-3799 ext. 1714
Fax: +886-2-2782-4814
Email: whm@iis.sinica.edu.tw
Web: http://www.iis.sinica.edu.tw/pages/whm

• Assistant Research Fellow, Institute of Information Science, Academia Sinica (1996/11-2002/12)
• Postdoctoral Fellow, Institute of Information Science, Academia Sinica (1995/10-1996/11)
• Ph.D., EE, National Taiwan University (1995)
• B.S., EE, National Taiwan University (1989)
• Technical Paper Award, The Chinese Institute of Engineers (1995)
• Editorial board member, International Journal of Computational Linguistics and Chinese Language Processing

Research Description

Our research interests include speech processing, natural language processing, multimedia information retrieval, and pattern recognition.

Communicating with computers using speech has been a dream of many people since the invention of computers. Progress towards realizing this dream has been slow but steady through the development of systems supporting voice commands, dictation, text-to-speech synthesis, and human-computer spoken dialogue. Speech recognition, speech synthesis, language understanding, and dialogue management are crucial to the development of human-computer speech interfaces. Our research has focused mainly on speech recognition, speech synthesis, and speaker recognition.

With the rapid advance of multimedia and internet technology, many digital library projects worldwide are investigating how multimedia digital libraries can be established and used. We have been studying audio segmentation, clustering, automatic speech recognition, indexing, and retrieval of Mandarin broadcast news for several years and have developed several basic technologies as well as prototype retrieval systems. More recently, we have extended our studies to music information retrieval, focusing mainly on query by singing/humming and solo vocal modeling. Our future plans include further improvement of speech and music information retrieval technology.

Selected Publications

1. Chih-Heng Lin, Chien-Hsing Wu, Pei-Yih Ting, and Hsin-min Wang, "Frameworks for recognition of Mandarin syllables with tone using sub-syllabic units," Speech Communication, 18(2), pp. 175-190, 1996.
2. Hsin-min Wang, Tai-hsuan Ho, Rung-chiung Yang, Jia-lin Shen, Bo-ren Bai, Jenn-chau Hong, Wei-peng Chen, Tong-lo Yu, and Lin-shan Lee, "Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary using limited training data," IEEE Trans. on Speech and Audio Processing, 5(2), pp. 195-200, March 1997.
3. Jia-lin Shen, Hsin-min Wang, Ren-yuan Lyu, and Lin-shan Lee, "Automatic selection of phonetically distributed sentence sets for speaker adaptation with application to large vocabulary Mandarin speech recognition," Computer Speech and Language, 13(1), pp. 79-97, Jan. 1999.
4. Lee-feng Chien, Hsin-min Wang, Bo-ren Bai, and Sung-chien Lin, "A spoken access approach for Chinese text and speech information retrieval," Journal of the American Society for Information Science, 51(4), pp. 313-323, 2000.
5. Hsin-min Wang, "Experiments in syllable-based retrieval of broadcast news speech in Mandarin Chinese," Speech Communication, 32(1-2), pp. 49-60, Sept. 2000.
6. Berlin Chen, Hsin-min Wang, and Lin-shan Lee, "Discriminating capabilities of syllable-based features and approaches of utilizing them for voice retrieval of speech information in Mandarin Chinese," IEEE Trans. on Speech and Audio Processing, 10(5), pp. 303-314, July 2002.
7. Hsin-min Wang, Shi-sian Cheng, and Yong-cheng Chen, "The SoVideo Mandarin Chinese broadcast news retrieval system," International Journal of Speech Technology, 7(2-3), pp. 189-202, April-July 2004.
8. Berlin Chen, Hsin-min Wang, and Lin-shan Lee, "A discriminative HMM/n-gram-based retrieval approach for Mandarin spoken documents," ACM Trans. on Asian Language Information Processing, 3(2), pp. 128-145, June 2004.
9. Wei-Ho Tsai, Dwight Rodgers, and Hsin-min Wang, "Blind clustering of popular music recordings based on singer voice characteristics," Computer Music Journal, 28(3), pp. 68-78, Fall 2004.
10. Shih-Sian Cheng, Hsin-min Wang, and Hsin-Chia Fu, "A model-selection-based self-splitting Gaussian mixture learning with application to speaker identification," EURASIP Journal on Applied Signal Processing, 2004(17), pp. 2626-2639, Dec. 2004.
11. Wei-Ho Tsai and Hsin-min Wang, "On the extraction of vocal-related information to facilitate the management of popular music collections," in Proc. IEEE/ACM Joint Conference on Digital Libraries (JCDL2005), USA, June 2005.
12. Chiu-yu Tseng, Shao-huang Pin, Yehlin Lee, Hsin-min Wang, and Yong-cheng Chen, "Fluent speech prosody: framework and modeling," Speech Communication, 46(3-4), pp. 284-309, July 2005.
13. Wei-Ho Tsai and Hsin-min Wang, "Automatic singer recognition of popular music recordings via estimation and modeling of solo vocal signals," IEEE Trans. on Audio, Speech, and Language Processing, 14(1), pp. 330-341, Jan. 2006.
