A reference set of collocation candidates extracted as surface bigrams from the Czech National Corpus and annotated as collocational or non-collocational.