
Extraction and Interpretation of Textual Data from Czech Insolvency Proceedings

Publication at the Faculty of Mathematics and Physics (Matematicko-fyzikální fakulta) | 2017

Abstract

The Czech Insolvency Register currently covers about 200,000 insolvency proceedings commenced since 2008. Several scanned documents can be attached to each proceeding (approximately 1,200,000 PDF files in total).

This study aims to find efficient pre-processing, clustering, and classification techniques capable of extracting valid information on the structure of indebtedness across Czech society from these PDF files.