We present the Charles University system for the MRL 2023 Shared Task on Multi-lingual Multi-task Information Retrieval.
The goal of the shared task was to develop systems for named entity recognition and question answering in several under-represented languages.
Our solutions to both subtasks rely on the translate-test approach.
We first translate the unlabeled examples into English using a multilingual machine translation model.
Then, we run inference on the translated data using a strong task-specific model.
Finally, we project the predicted labels back onto the original-language text.
To keep the inferred tags at the correct positions in the original-language text, we propose a method that scores candidate positions using a label-sensitive translation model.
For both subtasks, we also experiment with finetuning the classification models on the translated data.
However, due to a domain mismatch between the development data and the shared task validation and test sets, the finetuned models could not outperform our baselines.
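
The label projection step can be illustrated with a minimal sketch: enumerate candidate spans in the original sentence, mark each candidate with the tag, and keep the one that a scoring function rates highest against the tagged English translation. All names here (`candidate_spans`, `mark`, `project_tag`, `toy_score`) are hypothetical, and `toy_score` is only a string-similarity stand-in for the label-sensitive translation model, which this sketch does not attempt to reproduce.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # half-open [start, end) token indices


def candidate_spans(n_tokens: int, max_len: int) -> List[Span]:
    """Enumerate all spans of up to max_len tokens."""
    return [
        (s, e)
        for s in range(n_tokens)
        for e in range(s + 1, min(s + max_len, n_tokens) + 1)
    ]


def mark(tokens: List[str], span: Span, tag: str) -> List[str]:
    """Wrap the tokens of `span` in <tag> ... </tag> markers."""
    s, e = span
    return tokens[:s] + [f"<{tag}>"] + tokens[s:e] + [f"</{tag}>"] + tokens[e:]


def inside(tokens: List[str], tag: str) -> str:
    """Return the lowercased text between the <tag> and </tag> markers."""
    start = tokens.index(f"<{tag}>") + 1
    end = tokens.index(f"</{tag}>")
    return " ".join(tokens[start:end]).lower()


def toy_score(marked_src: List[str], tagged_en: List[str], tag: str) -> float:
    """Assumption: surface similarity replaces the translation model score."""
    return SequenceMatcher(None, inside(marked_src, tag), inside(tagged_en, tag)).ratio()


def project_tag(
    src_tokens: List[str],
    tagged_en: List[str],
    tag: str,
    score: Callable[[List[str], List[str], str], float] = toy_score,
    max_len: int = 3,
) -> Span:
    """Pick the source span whose marked version scores highest."""
    spans = candidate_spans(len(src_tokens), max_len)
    return max(spans, key=lambda sp: score(mark(src_tokens, sp, tag), tagged_en, tag))


src = ["Ali", "Ankara'da", "yaşıyor"]
per_span = project_tag(src, ["<PER>", "Ali", "</PER>", "lives", "in", "Ankara"], "PER")
loc_span = project_tag(src, ["Ali", "lives", "in", "<LOC>", "Ankara", "</LOC>"], "LOC")
print(per_span, loc_span)
```

In this toy example, the similarity-based scorer only works because the names are nearly identical in both languages; a model-based scorer is needed when entities are translated or transliterated.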