Tyne Liang and Jian-Shin Chen
Institute of Computer and Information Science
National Chiao Tung University
Hsinchu, 300 Taiwan
With the rapid growth of electronic literature in recent years, efficient named entity
extraction has become an indispensable part of automating knowledge base construction.
This paper presents an entity extraction system for biomedical knowledge acquisition.
Unlike most entity extraction systems, which do not handle term variants, the
proposed system incorporates a rule-based resolver that recovers the full forms of
target entities from their coordination variants. Evaluated on GENIA Corpus 3.0, the
resolution approach proved feasible, achieving 88.51% recall and 57.04% precision.
The kernel of the system is based on Hidden Markov Models (HMMs) with an
appropriate set of input features extracted from the training corpus. In
experiments on different corpora, the proposed system achieved promising results
in both entity boundary identification and entity classification.
Received July 7, 2004; revised November 12, 2004 & August 12, 2005; accepted November 17, 2005.
Communicated by Suh-Yin Lee.
* This work was partially supported by the National Science Council of Taiwan, R.O.C., under contract No.
NSC 91-2213-E-009-082.