Analysis of MultiWord Expression translation errors in Statistical Machine Translation

Publication at Faculty of Mathematics and Physics

2015

Abstract

In this paper, we analyse the usage of multiword expressions in Statistical Machine Translation. We exploit the Moses SMT toolkit to train models for French-English and Czech-Russian language pairs.

For each language pair, two models were built: a baseline model without additional MWE data and a model enhanced with MWE information. For the French-English pair, we tried three methods of introducing the MWE data.

For the Czech-Russian pair, we used just one method: adding automatically extracted data as a parallel corpus.
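
As a rough illustration of what adding extracted MWE data as a parallel corpus can look like in practice, the sketch below appends MWE translation pairs as extra line-aligned sentence pairs to an existing training corpus. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function name `extend_parallel_corpus` is hypothetical, the corpus is held in memory as two lists of lines, and the Czech-Russian pairs are made-up examples rather than data from the study.

```python
def extend_parallel_corpus(src_lines, tgt_lines, mwe_pairs):
    """Return copies of both corpus sides with MWE pairs appended.

    Hypothetical helper: each (source, target) MWE pair becomes one
    additional aligned line pair at the end of the corpus.
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError("parallel corpus sides must have the same number of lines")

    new_src, new_tgt = list(src_lines), list(tgt_lines)
    for src_mwe, tgt_mwe in mwe_pairs:
        # Normalise whitespace so each entry occupies exactly one line.
        src_mwe = " ".join(src_mwe.split())
        tgt_mwe = " ".join(tgt_mwe.split())
        # Skip pairs with an empty side; they would break line alignment.
        if src_mwe and tgt_mwe:
            new_src.append(src_mwe)
            new_tgt.append(tgt_mwe)
    return new_src, new_tgt


# Illustrative data only (assumed, not taken from the paper).
czech = ["studuje na univerzitě"]
russian = ["он учится в университете"]
mwe_pairs = [
    ("vysoká škola", "высшее учебное заведение"),
    ("železniční  stanice", "железнодорожная станция"),
    ("", "пустая пара"),  # dropped: empty source side
]

czech_ext, russian_ext = extend_parallel_corpus(czech, russian, mwe_pairs)
```

The two resulting lists stay the same length, so they can be written out as a pair of line-aligned files, which is the input form that SMT training pipelines such as Moses generally expect.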

Keywords

analysis, multiword expression, translation errors, statistical machine translation