Neural networks are the state-of-the-art machine learning method for many problems in natural language processing (NLP). Their success in machine translation and other NLP tasks is remarkable, yet they remain difficult to interpret.
We want to understand how neural networks represent meaning. To this end, we propose to examine how meaning is distributed across the vector space representations of words learned by neural networks trained for NLP tasks.
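As a minimal sketch of what such an examination could start from, the following Python snippet compares word embeddings by cosine similarity, a standard proxy for semantic relatedness in vector space models. The vectors here are toy values invented for illustration; in an actual experiment they would be extracted from a trained model.

```python
import numpy as np

# Hypothetical 4-dimensional embeddings; in practice these would be
# taken from a model trained on an NLP task (e.g., an NMT encoder).
embeddings = {
    "king":  np.array([0.8, 0.1, 0.6, 0.2]),
    "queen": np.array([0.7, 0.2, 0.7, 0.1]),
    "apple": np.array([0.1, 0.9, 0.0, 0.8]),
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, ~0.0 for unrelated."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Semantically related words are expected to lie closer in the vector space.
for a, b in [("king", "queen"), ("king", "apple")]:
    print(f"sim({a}, {b}) = {cosine(embeddings[a], embeddings[b]):.3f}")
```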
Furthermore, we propose to consider various theories of meaning from the philosophy of language and to develop a methodology that connects these two areas.