
Challenges in Accessing Information in Digitized 19th-Century Czech Texts

Publication at the Faculty of Arts (Filozofická fakulta) | 2012

Abstract

This short paper describes problems that arise in optical character recognition of, and information retrieval from, historical texts in languages with rich morphology, rather discontinuous lexical development, and a long history of spelling reforms. As a work in progress, it illustrates these problems and the proposed linguistic solutions using a current project that aims to improve access to digitized Czech prints from the 19th century and the first half of the 20th century.