
Unsupervised Word Sense Disambiguation Using Word Embeddings

Publication at Faculty of Mathematics and Physics |
2019

Abstract

Word sense disambiguation is the task of assigning the correct sense of a polysemous word in the context in which it appears. In recent years, word embeddings have been applied successfully to many NLP tasks.

Thanks to their ability to capture distributional semantics, recent attention has focused on utilizing word embeddings to disambiguate words. In this paper, a novel unsupervised method is proposed that disambiguates words in a first language by deploying a trained word-embedding model of a second language, using only a bilingual dictionary.

While the translated words themselves are useful clues for the disambiguation process, the main idea of this work is to use the information provided by the English translations of the surrounding words to disambiguate Persian words with a trained English word2vec, a well-known word-embedding model. Each translation of the polysemous word is compared against the word embeddings of the translated surrounding words to calculate similarity scores, and the most similar word to vec
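The comparison step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy embedding table, the candidate translations, and the averaging of cosine similarities over context words are all assumptions made for the example; a real system would use vectors from a trained English word2vec model and translations from a bilingual dictionary.

```python
import numpy as np

# Toy embedding table standing in for a trained English word2vec model.
# These vectors are illustrative only, not real word2vec outputs.
embeddings = {
    "money": np.array([0.9, 0.1, 0.0]),
    "river": np.array([0.0, 0.2, 0.9]),
    "cash":  np.array([0.8, 0.2, 0.1]),
    "loan":  np.array([0.7, 0.3, 0.0]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def disambiguate(candidate_translations, context_translations):
    """Pick the candidate English translation of the polysemous word
    whose vector is most similar, on average, to the vectors of the
    English-translated surrounding (context) words."""
    best, best_score = None, float("-inf")
    for cand in candidate_translations:
        if cand not in embeddings:
            continue  # skip translations absent from the embedding model
        scores = [cosine(embeddings[cand], embeddings[w])
                  for w in context_translations if w in embeddings]
        if scores:
            avg = sum(scores) / len(scores)
            if avg > best_score:
                best, best_score = cand, avg
    return best

# Hypothetical case: an ambiguous source word whose two sense
# translations are "money" vs. "river", with translated context
# words "cash" and "loan" pointing to the financial sense.
print(disambiguate(["money", "river"], ["cash", "loan"]))  # → money
```

Averaging the per-context-word similarities is one simple way to aggregate the context signal; summing the context vectors first and comparing once would behave similarly here.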