Koong H. C. Lin, Tung-Bo Chen and Von-Wun Soo
Department of Computer Science,
National Tsing Hua University,
HsinChu, 300, Taiwan, R.O.C.
Learning the semantics needed to understand a sentence has long been a challenging problem in natural language processing. Recently, many studies have demonstrated the feasibility of performing such tasks with a connectionist approach. In this study, we used the extended back-propagation learning method on recurrent networks to learn lexical encodings of thematic information while parsing Chinese sentences. In total, 31 simple sentences (92 words) from a set of verdict documents were prepared with correct thematic role assignments as the training set for supervised learning experiments on a simple recurrent network (SRN) and a modified SRN (MSRN). The test set of 83 legal sentences was derived by adding or deleting words, using unknown words, or combining parts of sentences from the training set. The experimental results show that the MSRN performed slightly better than the SRN. We also used a merge clustering algorithm and Kohonen's feature map to examine the learned distributed lexical representations. We conclude that, when the thematic role assignment learning method is used to encode lexicons, words with similar roles tend to be grouped together. This sheds some light on semantic associative reasoning.
Keywords: thematic role assignment, recurrent neural networks, extended backpropagation learning, connectionist parsing, distributed lexical encodings, Mandarin Chinese understanding
Received March 1, 1995; revised April 15, 1995.
Communicated by Hsi-Jian Lee.
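
For readers unfamiliar with the simple recurrent network (SRN) named in the abstract, the following is a minimal sketch of an Elman-style forward pass in which the hidden layer at each word is copied into a context layer for the next word. It is an illustration only, not the authors' implementation: the layer sizes, the random weights, the three-dimensional word vectors, and the word and role names are all hypothetical, and neither the extended back-propagation update of the lexical encodings nor the MSRN modification is shown.

```python
import math
import random


def sigmoid(x):
    """Logistic activation."""
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    """Random weight matrix with small values (assumed initialization)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def mat_vec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class SimpleRecurrentNetwork:
    """Elman-style SRN: input + context -> hidden -> output (no training)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        self.w_in = make_matrix(n_hidden, n_in, rng)       # input -> hidden
        self.w_ctx = make_matrix(n_hidden, n_hidden, rng)  # context -> hidden
        self.w_out = make_matrix(n_out, n_hidden, rng)     # hidden -> output

    def run(self, word_vectors):
        """Return one output vector (role activations) per input word."""
        context = [0.0] * self.n_hidden  # empty context at sentence start
        outputs = []
        for x in word_vectors:
            net = [a + b for a, b in zip(mat_vec(self.w_in, x),
                                         mat_vec(self.w_ctx, context))]
            hidden = [sigmoid(v) for v in net]
            outputs.append([sigmoid(v) for v in mat_vec(self.w_out, hidden)])
            context = hidden  # copy hidden state for the next word
        return outputs


# Hypothetical distributed encodings for placeholder words.
LEXICON = {
    "word_a": [0.9, 0.1, 0.2],
    "word_b": [0.2, 0.8, 0.1],
}
ROLES = ["agent", "theme"]  # hypothetical thematic role inventory

net = SimpleRecurrentNetwork(n_in=3, n_hidden=4, n_out=len(ROLES))
sentence = ["word_a", "word_b", "word_a"]
role_activations = net.run([LEXICON[w] for w in sentence])
```

Because the context layer carries the previous hidden state, the same word vector generally yields different output activations at different positions in the sentence, which is what lets such a network assign roles that depend on word order.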