Brochure 2020
Another element of our research work on knowledge acquisition in 2019 pertained to the development of an effective approach for removing noisy samples under distant supervision. Most machine-learning techniques require a set of training data, but labeling training data is expensive both in terms of time and money. Distant supervision represents an alternative approach to generating training data. In distant supervision, an existing database is used to collect examples for the relation we want to extract, even though that process results in a large amount of noisy training data being generated. Thus, distant supervision is vulnerable to false-negative (FN) sentences, especially when extracting relations from text. Treating FN input as non-relation training sentences can diminish final model performance. To overcome this problem, we generated H-FND, a hierarchical false-negative denoising framework for robust relation extraction (see Figure 2). H-FND uses a hierarchical protocol that first determines if non-relation (NA) sentences should be kept, discarded, or revised during training. Then, those NA sentences are revised into appropriate relations for better training input. We have conducted experiments on SemEval-2010 and randomly filtered ratios of training/validation sentences into NA. Our results show that H-FND revises FN sentences correctly and maintains high F1 scores even when 50% of the sentences have been filtered out.
Figure 2: H-FND framework. The process in this diagram is executed per epoch.
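To make the false-negative problem concrete, the following is a minimal Python sketch of distant-supervision labeling and of randomly relabeling a ratio of relation sentences as NA. It is an illustration only and not the H-FND implementation: the toy knowledge base `KB`, the example sentences, and the helper names `distant_label` and `inject_false_negatives` are all hypothetical, and the relabeling step is just one plausible reading of the filtering described in the text.

```python
import random

NA = "NA"  # label for "no relation"

# Hypothetical toy knowledge base mapping (head, tail) entity pairs to relations.
KB = {
    ("Paris", "France"): "capital_of",
    ("Marie Curie", "Warsaw"): "born_in",
}


def distant_label(head, tail, kb=KB):
    """Return the KB relation for an entity pair, or NA if the pair is absent."""
    return kb.get((head, tail), NA)


def inject_false_negatives(labels, ratio, seed=0):
    """Relabel a random `ratio` of the relation-bearing examples as NA.

    Assumed reading of the experimental setup: only examples that carry a
    relation are candidates, and existing NA examples are left untouched.
    """
    rng = random.Random(seed)
    positives = [i for i, lab in enumerate(labels) if lab != NA]
    flipped = set(rng.sample(positives, round(ratio * len(positives))))
    return [NA if i in flipped else lab for i, lab in enumerate(labels)]


sentences = [
    ("Paris", "France", "Paris is the capital of France."),
    ("Marie Curie", "Warsaw", "Marie Curie was born in Warsaw."),
    # This pair is missing from the toy KB, so it is labeled NA even though
    # the sentence expresses a relation: a false negative.
    ("Berlin", "Germany", "Berlin is the capital of Germany."),
]

labels = [distant_label(head, tail) for head, tail, _ in sentences]
noisy = inject_false_negatives(["r1", "r2", "r3", "r4", NA], ratio=0.5)
```

Here the third sentence receives the NA label only because the knowledge base is incomplete, which is the kind of training sentence a denoising step would need to keep, discard, or revise.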