Adding Words to Manuscripts: From PagesXML to TEITOK

Publication at Faculty of Mathematics and Physics


This article describes a two-step method for transcribing historical manuscripts. The first step uses a page-based representation, which makes it easy to transcribe the document page by page and line by line. The second step converts that representation to the text-based TEI/XML format so that the document becomes fully searchable.
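
The second step of the method, moving from a page-based layout to a text-based one, can be illustrated with a minimal sketch. It is not the article's converter: the input is a hypothetical, simplified page/line structure rather than the real page-based schema, and the function name `pages_to_text` is an assumption made for illustration. The output uses the standard TEI elements `pb` (page break), `lb` (line break), and `w` (word), so that page and line boundaries become empty milestones inside a continuous stream of words.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page-based input (NOT the real page-based schema):
# the text is organised by page, then by line.
PAGE_BASED = """
<document>
  <page n="1">
    <line>In the beginning</line>
    <line>was the word</line>
  </page>
  <page n="2">
    <line>and the word</line>
  </page>
</document>
"""


def pages_to_text(page_xml: str) -> ET.Element:
    """Convert the simplified page-based input to a text-based TEI-like tree.

    Pages and lines become empty <pb/> and <lb/> milestones; every
    whitespace-separated token becomes a <w> element.
    """
    source = ET.fromstring(page_xml)
    tei = ET.Element("TEI")
    body = ET.SubElement(ET.SubElement(tei, "text"), "body")
    para = ET.SubElement(body, "p")
    for page in source.iter("page"):
        ET.SubElement(para, "pb", n=page.get("n", ""))
        for line in page.iter("line"):
            ET.SubElement(para, "lb")
            for token in (line.text or "").split():
                word = ET.SubElement(para, "w")
                word.text = token
                word.tail = " "  # keep words separated in the serialised text
    return tei


def words(tei: ET.Element) -> list[str]:
    """Return the running text as a flat list of words, ignoring layout."""
    return [w.text for w in tei.iter("w")]


tei = pages_to_text(PAGE_BASED)
print(ET.tostring(tei, encoding="unicode"))
```

Once the words form a single sequence, a search for a phrase no longer depends on where the page or line breaks fall, which is the sense in which the text-based form is searchable.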