ParCzech PS7 1.0

Publication

Abstract

The ParCzech PS7 1.0 corpus is the first member of a family of corpora built from data of the Parliament of the Czech Republic. It consists of stenographic protocols that record the meetings of the Chamber of Deputies held during its 7th term, from 2013 to 2017.

The audio recordings are available as well. Transcripts are provided in the original HTML as harvested and are also converted into a TEI-derived XML format for use in the TEITOK corpus manager.

The corpus is automatically enriched with morphological and named-entity annotations using the tools MorphoDiTa and NameTag.