The high performance of large pretrained language models such as BERT (Devlin et al., 2019) on NLP tasks has prompted questions about BERT's linguistic capabilities and how they differ from humans'. In this paper, we approach this question by examining BERT's knowledge of lexical semantic relations.
We focus on hypernymy, the "is-a" relation that relates a word to a superordinate category. We use a prompting methodology to ask BERT directly what the hypernym of a given word is.
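As a rough illustration of this kind of prompting, the sketch below queries BERT's masked-language-modelling head with a cloze-style template. The template "A robin is a [MASK]." and the bert-base-uncased checkpoint are assumptions chosen for exposition, not necessarily the exact setup used in our experiments.

```python
# Minimal sketch: prompting BERT for a hypernym via its masked-LM head.
# The cloze template and model checkpoint below are illustrative assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Ask BERT to fill in the superordinate category of "robin".
predictions = fill_mask("A robin is a [MASK].")
for p in predictions:
    # Each prediction carries the filled-in token and its probability.
    print(f"{p['token_str']:>12}  {p['score']:.3f}")
```

In this setup, the tokens BERT proposes for the masked slot serve as its candidate hypernyms, and their probabilities can be used to rank them.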
We find that, in a setting where all hypernyms are guessable via prompting, BERT knows hypernyms with up to 57% accuracy. Moreover, BERT with prompting outperforms other unsupervised models for hypernym discovery even in an unconstrained scenario.
However, BERT's predictions and performance on a dataset containing uncommon hyponyms and hypernyms indicate that its knowledge of hypernymy is still limited.