Named Entities in Czech: Annotating Data and Developing NE Tagger

Publication at Faculty of Mathematics and Physics | 2009

Abstract

This paper deals with the treatment of Named Entities (NEs) in Czech. We introduce a two-level NE classification.

We used this classification to manually annotate two thousand sentences, yielding more than 11,000 NE instances. Using the annotated data and machine-learning techniques (namely, top-down induction of decision trees), we developed and evaluated a software system for automatic detection and classification of NEs in Czech texts.
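
To illustrate what top-down induction of decision trees can look like for NE classification, here is a minimal ID3-style sketch in Python. It is not the system described in the paper: the features (`capitalized`, `prev`), the toy training tokens, and the labels (written as a hypothetical `supertype/subtype` pair to mimic a two-level scheme, with `O` for non-entities) are all assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def best_feature(rows, labels, features):
    """Pick the feature with the highest information gain."""
    base = entropy(labels)

    def gain(feature):
        remainder = 0.0
        for value in set(row[feature] for row in rows):
            subset = [lab for row, lab in zip(rows, labels)
                      if row[feature] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return base - remainder

    return max(features, key=gain)


def build_tree(rows, labels, features):
    """Grow the tree top-down: split on the best feature, then recurse."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: a single label
    feature = best_feature(rows, labels, features)
    remaining = [f for f in features if f != feature]
    children = {}
    for value in set(row[feature] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[feature] == value]
        children[value] = build_tree([rows[i] for i in idx],
                                     [labels[i] for i in idx],
                                     remaining)
    return {"feature": feature, "children": children, "default": majority}


def classify(tree, row):
    """Walk from the root to a leaf; fall back to the node's majority label."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], tree["default"])
    return tree


# Hypothetical toy data: one row per token, with invented features and labels.
rows = [
    {"capitalized": True, "prev": "pan"},
    {"capitalized": True, "prev": "v"},
    {"capitalized": False, "prev": "v"},
    {"capitalized": False, "prev": "pan"},
    {"capitalized": True, "prev": "pan"},
    {"capitalized": True, "prev": "v"},
]
labels = ["person/surname", "geo/city", "O", "O", "person/surname", "geo/city"]

tree = build_tree(rows, labels, ["capitalized", "prev"])
print(classify(tree, {"capitalized": True, "prev": "pan"}))
```

On this toy data the root split is on `capitalized`, since it separates non-entities from entities in one step; the `prev` feature then distinguishes the two entity labels. A real tagger would need far richer features and a much larger annotated corpus.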