@inproceedings{feng-etal-2012-sequence,
title = "Sequence labeling-based reordering model for phrase-based {SMT}",
author = "Feng, Minwei and
Peter, Jan-Thorsten and
Ney, Hermann",
booktitle = "Proceedings of the 9th International Workshop on Spoken Language Translation: Papers",
month = dec # " 6-7",
year = "2012",
address = "Hong Kong",
url = "https://aclanthology.org/2012.iwslt-papers.16/",
pages = "260--267",
abstract = "For current statistical machine translation system, reordering is still a major problem for language pairs like Chinese-English, where the source and target language have significant word order differences. In this paper, we propose a novel reordering model based on sequence labeling techniques. Our model converts the reordering problem into a sequence labeling problem, i.e. a tagging task. For the given source sentence, we assign each source token a label which contains the reordering information for that token. We also design an unaligned word tag so that the unaligned word phenomenon is automatically implanted in the proposed model. Our reordering model is conditioned on the whole source sentence. Hence it is able to catch the long dependency in the source sentence. Although the learning on large scale task requests notably amounts of computational resources, the decoder makes use of the tagging information as soft constraints. Therefore, the training procedure of our model is computationally expensive for large task while in the test phase (during translation) our model is very efficient. We carried out experiments on five Chinese-English NIST tasks trained with BOLT data. Results show that our model improves the baseline system by 1.32 BLEU 1.53 TER on average."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="feng-etal-2012-sequence">
<titleInfo>
<title>Sequence labeling-based reordering model for phrase-based SMT</title>
</titleInfo>
<name type="personal">
<namePart type="given">Minwei</namePart>
<namePart type="family">Feng</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jan-Thorsten</namePart>
<namePart type="family">Peter</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hermann</namePart>
<namePart type="family">Ney</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2012-12</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 9th International Workshop on Spoken Language Translation: Papers</title>
</titleInfo>
<originInfo>
<place>
<placeTerm type="text">Hong Kong</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>For current statistical machine translation system, reordering is still a major problem for language pairs like Chinese-English, where the source and target language have significant word order differences. In this paper, we propose a novel reordering model based on sequence labeling techniques. Our model converts the reordering problem into a sequence labeling problem, i.e. a tagging task. For the given source sentence, we assign each source token a label which contains the reordering information for that token. We also design an unaligned word tag so that the unaligned word phenomenon is automatically implanted in the proposed model. Our reordering model is conditioned on the whole source sentence. Hence it is able to catch the long dependency in the source sentence. Although the learning on large scale task requests notably amounts of computational resources, the decoder makes use of the tagging information as soft constraints. Therefore, the training procedure of our model is computationally expensive for large task while in the test phase (during translation) our model is very efficient. We carried out experiments on five Chinese-English NIST tasks trained with BOLT data. Results show that our model improves the baseline system by 1.32 BLEU 1.53 TER on average.</abstract>
<identifier type="citekey">feng-etal-2012-sequence</identifier>
<location>
<url>https://aclanthology.org/2012.iwslt-papers.16/</url>
</location>
<part>
<date>2012-12</date>
<extent unit="page">
<start>260</start>
<end>267</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Sequence labeling-based reordering model for phrase-based SMT
%A Feng, Minwei
%A Peter, Jan-Thorsten
%A Ney, Hermann
%S Proceedings of the 9th International Workshop on Spoken Language Translation: Papers
%D 2012
%8 dec 6 7
%C Hong Kong
%F feng-etal-2012-sequence
%X For current statistical machine translation system, reordering is still a major problem for language pairs like Chinese-English, where the source and target language have significant word order differences. In this paper, we propose a novel reordering model based on sequence labeling techniques. Our model converts the reordering problem into a sequence labeling problem, i.e. a tagging task. For the given source sentence, we assign each source token a label which contains the reordering information for that token. We also design an unaligned word tag so that the unaligned word phenomenon is automatically implanted in the proposed model. Our reordering model is conditioned on the whole source sentence. Hence it is able to catch the long dependency in the source sentence. Although the learning on large scale task requests notably amounts of computational resources, the decoder makes use of the tagging information as soft constraints. Therefore, the training procedure of our model is computationally expensive for large task while in the test phase (during translation) our model is very efficient. We carried out experiments on five Chinese-English NIST tasks trained with BOLT data. Results show that our model improves the baseline system by 1.32 BLEU 1.53 TER on average.
%U https://aclanthology.org/2012.iwslt-papers.16/
%P 260-267
Markdown (Informal)
[Sequence labeling-based reordering model for phrase-based SMT](https://aclanthology.org/2012.iwslt-papers.16/) (Feng et al., IWSLT 2012)
ACL