Research Descriptions
 


        Our recent projects aim to build a Chinese natural language processing system with learning capability, so that the system can analyze new documents and extract linguistic and world knowledge automatically. To achieve this goal, the system has been equipped with the following functions: a) lexical analysis, b) unknown word identification with semantic and syntactic category prediction, c) sentence parsing, and d) knowledge extraction and representation. These are the basic tools and building blocks of the online automatic learning system. The fundamental requirement for a computer with learning capability is basic knowledge about language and the world, sufficient for the system to use that knowledge to discover new words, new concepts, new conceptual relations, and new linguistic structures. Learning new linguistic patterns and semantic relations relies on parsers with inference ability. These parsers predict the structure of an input sentence by inferring the semantic relations between the semantic categories of new words and their contextual words. A semantic attraction model will be applied during sentence parsing. The semantic attraction values between words and semantic classes are trained from a web corpus and refined at each step of the learning process. In this way, system performance will improve automatically through online self-learning.
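
        To make the idea of semantic attraction more concrete, the following is a minimal sketch only. The description above does not define how attraction values are computed, so the sketch assumes a pointwise-mutual-information style score estimated from (word, semantic class) co-occurrence counts. The function names train_attraction and best_head, the class labels, and the toy data are all hypothetical and are not taken from the actual system.

```python
import math
from collections import Counter


def train_attraction(pairs):
    """Estimate attraction scores from (word, semantic_class) co-occurrences.

    ASSUMPTION: attraction is approximated by pointwise mutual information;
    the real model may use a different formula.
    """
    pairs = list(pairs)
    total = len(pairs)
    pair_counts = Counter(pairs)
    word_counts = Counter(word for word, _ in pairs)
    class_counts = Counter(sem_class for _, sem_class in pairs)
    return {
        (word, sem_class): math.log2(
            count * total / (word_counts[word] * class_counts[sem_class])
        )
        for (word, sem_class), count in pair_counts.items()
    }


def best_head(scores, candidate_heads, new_word_class):
    """Pick the context word most attracted to the new word's predicted class.

    Pairs never seen in training get negative infinity, so they are chosen
    only if no candidate has a known score.
    """
    return max(
        candidate_heads,
        key=lambda head: scores.get((head, new_word_class), float("-inf")),
    )


# Hypothetical toy corpus: context words paired with semantic classes.
pairs = [
    ("eat", "FOOD"), ("eat", "FOOD"), ("eat", "FOOD"),
    ("read", "TEXT"), ("read", "TEXT"), ("read", "FOOD"),
]
scores = train_attraction(pairs)

# An unknown word predicted to belong to class FOOD attaches to "eat".
print(best_head(scores, ["eat", "read"], "FOOD"))
```

        In this sketch, refining the values at each learning step would amount to appending newly parsed (word, semantic class) pairs to the training data and recomputing the scores.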