
Translating Short Segments with NMT: A Case Study for English-to-Hindi Translation

Publication at the Faculty of Mathematics and Physics | 2018

Abstract

This paper presents a case study in translating short image captions from the Visual Genome dataset from English into Hindi, using out-of-domain data sets of varying size. We experiment with three NMT models: shallow and deep sequence-to-sequence models and the Transformer model, all as implemented in the Marian toolkit.

Phrase-based Moses serves as the baseline. The results indicate that the Transformer model outperforms the others in the large data setting on a number of automatic metrics and in manual evaluation, and that it also produces the fewest truncated sentences.

Transformer training is, however, very sensitive to hyperparameters, so it requires more experimentation.