The Construction of a Concept-based Chinese Knowledge Base with Semantic Composition Capability
Artificial Intelligence Projects
Principal Investigator: Dr. Wei-Yun Ma
Project Period: 2019/1~2022/12
My research team was awarded a four-year (2019/1~2022/12) AI-related project by the Ministry of Science and Technology (MOST), beginning in 2019. MOST grants US$230,000 to this project each year. The objective is to develop a Chinese knowledge base that can resolve realistic downstream problems and support AI development in Taiwan. In 2012, Google presented its knowledge base (Knowledge Graph) as an external resource to significantly enhance the value of information returned by Google searches. Since then, the construction of knowledge bases has attracted a lot of attention in both industry and academia. Consequently, various applications of knowledge bases have been successfully developed and deployed. Given rapid developments in deep learning, there is a boom in encoding the information from knowledge bases into deep learning models to empower them to resolve various downstream problems.

Developing a practical and high-quality knowledge base for information processing is crucial yet challenging. The number of entities involved can run into the millions or tens of millions, or even be infinite. Accordingly, it is impossible to build a knowledge base manually, so it must be built automatically. Concurrently, the inference mechanism of the knowledge base, including causality, relationships, and action processes, must also be considered, so the form of representation and the representation ability of the knowledge base must be carefully designed. To build a practical knowledge base of this nature, we are currently engaged in researching knowledge acquisition, representation, reasoning, and utilization. For each of these, we have developed models in both English and Chinese versions. We use the former primarily for model comparisons to meet academic requirements, whereas the latter are employed for practical construction of a Chinese knowledge base.

In the first year of the project (2019), we mainly focused on knowledge acquisition from raw texts and on knowledge representation. For knowledge acquisition from raw texts, we developed GraphRel, an end-to-end relation extraction model that uses graph convolutional networks (GCNs) to jointly learn named entities and relations (see Figure 1). Unlike previous baselines, we consider the interactions between named entities and relations via a 2-phase relation-weighted GCN to better extract relations. Linear and dependency structures were both used to extract sequential and regional features of the input text, and then a complete word graph was employed to extract implicit features among all word pairs of that text. Through our graph-based approach, predictions for overlapping relations are substantially improved over previously reported sequential approaches. We have evaluated GraphRel on two public datasets: NYT and WebNLG. Our results show that GraphRel maintains high precision and exhibits substantially enhanced recall. Moreover, we found that GraphRel outperforms the previous state of the art by 3.2% in F1 score on NYT and 5.8% in F1 score on WebNLG, thereby achieving in 2019 a new state of the art for relation extraction.
Figure 1: Overview of GraphRel with 2-phase relation-weighted GCN.
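
As a rough illustration of the relation-weighted graph convolution idea mentioned in the GraphRel description, the following Python sketch performs one message-passing step over a complete word graph. It is a simplified sketch under stated assumptions, not the GraphRel implementation: the function name `relation_weighted_gcn_layer`, the per-relation edge probabilities `edge_probs`, the square projection matrices `rel_weights`, and the use of a ReLU with a residual connection are all hypothetical choices made for this example.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(weight: Matrix, vec: Vector) -> Vector:
    """Multiply a (d x d) matrix by a length-d vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def relation_weighted_gcn_layer(
    features: List[Vector],
    edge_probs: Dict[str, Matrix],
    rel_weights: Dict[str, Matrix],
) -> List[Vector]:
    """One graph-convolution step over a complete word graph.

    features[v]         -- current feature vector of word v
    edge_probs[r][u][v] -- assumed probability that relation r links words u and v
    rel_weights[r]      -- hypothetical square projection matrix for relation r
    """
    n = len(features)
    dim = len(features[0])
    updated = []
    for u in range(n):
        total = [0.0] * dim
        for rel, probs in edge_probs.items():
            for v in range(n):
                # Project the neighbour's features with the relation's matrix,
                # then scale the message by the edge probability for (u, v).
                projected = matvec(rel_weights[rel], features[v])
                total = [t + probs[u][v] * p for t, p in zip(total, projected)]
        # ReLU on the aggregated message, then add the word's own features
        # back in (residual connection).
        updated.append([max(0.0, t) + h for t, h in zip(total, features[u])])
    return updated


# Toy example: two words, one made-up relation, identity projection.
toy_features = [[1.0, 0.0], [0.0, 1.0]]
toy_probs = {"born_in": [[0.0, 1.0], [0.0, 0.0]]}
toy_weights = {"born_in": [[1.0, 0.0], [0.0, 1.0]]}
print(relation_weighted_gcn_layer(toy_features, toy_probs, toy_weights))
# prints [[1.0, 1.0], [0.0, 1.0]]
```

In this toy example, word 0 receives the features of word 1 because the assumed probability of the relation on the edge (0, 1) is 1.0, while word 1 receives no message and keeps its original features.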