

Journal of Information Science and Engineering, Vol. 32 No. 6, pp. 1697-1710 (November 2016)


Global Source-Aware Statistical Post-Editing for General MT: Sentence Specification via Pseudo-Feedback


JUNGUO ZHU, MUYUN YANG, TIEJUN ZHAO AND SHENG LI
School of Computer Science and Technology
Harbin Institute of Technology
Harbin, 150001 P.R. China
E-mail: {jgzhu; ymy}@mtlab.hit.edu.cn; {tjzhao; lisheng}@hit.edu.cn

Automatic post-editing (APE), which corrects translation errors, is an effective approach to improving the quality of machine translation (MT) output. This paper proposes a global source-aware statistical post-editing (SPE) model that leverages pseudo-feedback to achieve sentence specification. For a given source sentence, similar sentences are retrieved from a translation memory (TM) to serve as post-editing data. These data are a set of trilingual parallel texts, each containing a source sentence, its raw machine translation, and its gold reference (human translation). The alignments between the raw translations and the references are used to re-examine the effectiveness of the post-editing phrase pairs of the source-independent SPE model, and the selected phrase pairs are then applied to polish the raw translations. Experimental results show that our method improves the original output of Google Translate by 3.78 BLEU points, and that it outperforms a source-independent SPE model by 1.09 BLEU points and a local source-aware SPE model by 1.02 BLEU points.
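The pipeline described in the abstract (retrieve similar sentences, re-examine phrase pairs against them, apply the survivors) can be illustrated with a small toy sketch. This is not the authors' implementation: the Jaccard token-overlap retrieval, the co-occurrence check that stands in for word alignment, the `min_support` threshold, all function names, and the Spanish-English toy data are assumptions made purely for illustration.

```python
# Toy sketch of source-aware phrase-pair selection via pseudo-feedback.
# Every name, threshold, and data item here is hypothetical.

def jaccard(a, b):
    """Token-overlap similarity between two sentences (assumed retrieval metric)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def retrieve(source, tm, k=2):
    """Return the k TM triples (source, raw MT, reference) most similar to `source`."""
    return sorted(tm, key=lambda t: jaccard(source, t[0]), reverse=True)[:k]

def contains(text, phrase):
    """Whole-token phrase match, so that 'bank' does not match 'banking'."""
    return f" {phrase} " in f" {text} "

def select_pairs(candidates, feedback, min_support=1):
    """Keep post-editing phrase pairs (mt_phrase -> ref_phrase) that the retrieved
    triples support. Co-occurrence in raw MT and reference is a crude stand-in
    for the alignment evidence mentioned in the abstract."""
    selected = []
    for mt_phrase, ref_phrase in candidates:
        support = sum(
            1 for _, mt, ref in feedback
            if contains(mt, mt_phrase) and contains(ref, ref_phrase)
        )
        if support >= min_support:
            selected.append((mt_phrase, ref_phrase))
    return selected

def post_edit(raw, pairs):
    """Apply the selected phrase pairs to the raw translation."""
    padded = f" {raw} "
    for mt_phrase, ref_phrase in pairs:
        padded = padded.replace(f" {mt_phrase} ", f" {ref_phrase} ")
    return padded.strip()

# Toy translation memory: (source, raw MT, human reference).
tm = [
    ("él abrió una cuenta en el banco",
     "he opened an account at the bench",
     "he opened an account at the bank"),
    ("ella se sentó en el banco del parque",
     "she sat on the bank in the park",
     "she sat on the bench in the park"),
    ("ella cerró la cuenta en el banco",
     "she closed the account at the bench",
     "she closed the account at the bank"),
]

# Phrase pairs a source-independent model might hold (both directions).
candidates = [("the bench", "the bank"), ("the bank", "the bench")]

source = "ella abrió una cuenta en el banco"
raw = "she opened an account at the bench"

feedback = retrieve(source, tm, k=2)
selected = select_pairs(candidates, feedback)
edited = post_edit(raw, selected)
print(edited)
```

In this toy setting, the two retrieved neighbours both concern bank accounts, so only the pair that rewrites "the bench" as "the bank" is kept; the opposite pair, which is valid for the park sentence, finds no support and is not applied.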

Keywords: machine translation, translation quality, post-editing, pseudo-feedback, source-aware


Received August 10, 2015; revised January 6, 2016; accepted February 18, 2016.
Communicated by Chao-Lin Liu.