
Analysis of MultiWord Expression translation errors in Statistical Machine Translation

Publication at Faculty of Mathematics and Physics


In this paper, we analyse the usage of multiword expressions (MWEs) in Statistical Machine Translation (SMT). We use the Moses SMT toolkit to train models for the French-English and Czech-Russian language pairs.

For each language pair, two models were built: a baseline model without additional MWE data and a model enhanced with MWE information. For the French-English pair, we tried three methods of introducing the MWE data.

For the Czech-Russian pair, we used only one method: adding automatically extracted data as a parallel corpus.
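
To make the idea of adding MWE data as a parallel corpus concrete, here is a minimal Python sketch. The abstract does not describe the actual procedure, so this is only an illustration under assumptions: the function name `append_mwe_pairs`, the in-memory list representation, and the sample sentences are all hypothetical. It appends each extracted MWE pair as an extra aligned line to both sides of the training data, which could then be written out and passed to the usual training pipeline.

```python
def append_mwe_pairs(src_lines, tgt_lines, mwe_pairs):
    """Return copies of both corpus sides with MWE pairs appended as extra aligned lines.

    Hypothetical helper: the paper's actual procedure is not specified here.
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError("both sides of a parallel corpus must have the same number of lines")

    new_src = list(src_lines)
    new_tgt = list(tgt_lines)
    for src_mwe, tgt_mwe in mwe_pairs:
        src_mwe, tgt_mwe = src_mwe.strip(), tgt_mwe.strip()
        # Skip pairs with an empty side so the two files stay line-aligned.
        if src_mwe and tgt_mwe:
            new_src.append(src_mwe)
            new_tgt.append(tgt_mwe)
    return new_src, new_tgt


# Hypothetical toy data, for illustration only.
corpus_cs = ["to je dobrý nápad"]
corpus_ru = ["это хорошая идея"]
mwe = [("v souladu s", "в соответствии с"), ("", "пусто")]

aug_cs, aug_ru = append_mwe_pairs(corpus_cs, corpus_ru, mwe)
```

Treating each MWE pair as a short "sentence pair" keeps the two sides line-aligned, which is what word-alignment and phrase-extraction tools expect from parallel input.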