We investigate how a supervised machine learning model for reranking query translations can be adapted to new languages in the context of cross-lingual information retrieval. The model is trained to rerank multiple translations produced by a statistical machine translation system so as to optimize retrieval quality.
The model features do not depend on the source language and thus allow the model to be trained on query translations coming from multiple languages. In this paper, we explore how this affects the final retrieval quality.
The experiments are conducted on a medical-domain test collection in English with multilingual queries (in Czech, German, and French) from the CLEF eHealth Lab series 2013--2015. We adapt our method to allow reranking of query translations for four new languages (Spanish, Hungarian, Polish, and Swedish).
The baseline approach, where a single model is trained for each source language on query translations from that language, is compared with a model co-trained on translations from the three