Charles Explorer logo
🇨🇿

Towards an Indonesian-English SMT System: A Case Study of an Under-Studied and Under-Resourced Language, Indonesian

Publikace na Matematicko-fyzikální fakulta |
2012

Tento text není v aktuálním jazyce dostupný. Zobrazuje se verze "en".Abstrakt

This paper describes a work on preparing an Indonesian-English Statistical Machine Translation (SMT) System. It includes the creation of Indonesian morphological analyzer, MorphInd, and the composing of an Indonesian-English parallel corpus, IDENTIC.

We build an SMT system using the state-of-the-art phrase-based SMT system, MOSES. We show several scenarios where the morphological tool is used to incorporate morphological information in the SMT system trained with the composed parallel corpus.