This paper provides the system description of “Silo NLP's” submission to the Workshop on Asian Translation (WAT2022). We participated in the Indic multimodal tasks (English-\textgreater{}Hindi, English-\textgreater{}Malayalam, and English-\textgreater{}Bengali multimodal translation).
For text-only translation, we used a Transformer model and fine-tuned mBART. For multimodal translation, we used the same architecture and extracted object tags from the images, which serve as visual features and are concatenated with the text sequence to form the input.
Our submission ranks first in several tasks, including English-\textgreater{}Hindi multimodal translation (evaluation test), English-\textgreater{}Malayalam text-only and multimodal translation (evaluation test), English-\textgreater{}Bengali multimodal translation (challenge test), and English-\textgreater{}Bengali text-only translation (evaluation test).
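
As an illustration of the multimodal input described above, the following minimal sketch shows one way object tags could be concatenated with the source sentence. It is not the submitted system's code. The function name `build_multimodal_input`, the `##` separator, the tag limit, the lowercasing and deduplication, and the choice to append the tags after the text are all assumptions made for this example; the object detector that produces the tags is outside its scope.

```python
from typing import List


def build_multimodal_input(
    source: str,
    object_tags: List[str],
    sep: str = "##",      # hypothetical separator token, not from the paper
    max_tags: int = 10,   # hypothetical cap on the number of tags
) -> str:
    """Append deduplicated object tags to the source sentence.

    Assumption: tags follow the text, separated by `sep`. The paper only
    states that tags are concatenated with the text sequence.
    """
    seen = set()
    unique_tags = []
    for tag in object_tags:
        normalized = tag.strip().lower()
        # Skip empty strings and tags already seen, keeping first-seen order.
        if normalized and normalized not in seen:
            seen.add(normalized)
            unique_tags.append(normalized)

    kept = unique_tags[:max_tags]
    text = source.strip()
    if not kept:
        # No usable tags: fall back to the text-only input.
        return text
    return f"{text} {sep} {' '.join(kept)}"


example = build_multimodal_input(
    "a man riding a horse", ["Horse", "man", "horse "]
)
```

With this construction the same sequence-to-sequence model can be used for both settings, since a sentence with no detected objects reduces to the text-only input.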