Data Extraction Using NLP Techniques and Its Transformation to Linked Data

Publication at Faculty of Mathematics and Physics

2014

Abstract

We present a system that extracts a knowledge base from raw, unstructured texts. The knowledge base is designed as a set of entities and their relations and is represented in an ontological framework. The extraction pipeline processes input texts with linguistically aware tools and extracts entities and relations from their syntactic representation.

Subsequently, the extracted data is represented according to the Linked Data principles. The system is designed to be both domain independent and language independent, and it provides users with data that supports more intelligent search than full-text search.

We present our first case study on processing Czech legal texts.
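
As an illustration of the pipeline outlined above, the following is a minimal sketch, not the system's actual implementation. It assumes a toy dependency parse with `nsubj` and `obj` labels, a hypothetical `http://example.org/` namespace, and hypothetical helper names (`extract_relations`, `to_ntriples`); none of these come from the publication. It reads subject-verb-object relations off the syntactic representation and serializes them as N-Triples, one common Linked Data format.

```python
from urllib.parse import quote

# Hypothetical namespace; the URIs used by the real system are not given here.
BASE = "http://example.org/"

# Toy dependency parse: (token id, form, lemma, head id, dependency label).
# The labels (nsubj, obj, root) are assumed for illustration only.
PARSE = [
    (1, "Court", "court", 2, "nsubj"),
    (2, "annuls", "annul", 0, "root"),
    (3, "decision", "decision", 2, "obj"),
]


def extract_relations(parse):
    """Return (subject, predicate, object) lemma triples for every head
    token that has both a subject and an object dependent."""
    subjects, objects, lemmas = {}, {}, {}
    for tok_id, _form, lemma, head, label in parse:
        lemmas[tok_id] = lemma
        if label == "nsubj":
            subjects[head] = lemma
        elif label == "obj":
            objects[head] = lemma
    return [(subjects[h], lemmas[h], objects[h]) for h in subjects if h in objects]


def to_ntriples(triples, base=BASE):
    """Serialize lemma triples as N-Triples lines with URIs under `base`."""
    lines = []
    for s, p, o in triples:
        lines.append(
            f"<{base}entity/{quote(s)}> "
            f"<{base}relation/{quote(p)}> "
            f"<{base}entity/{quote(o)}> ."
        )
    return "\n".join(lines)


triples = extract_relations(PARSE)
ntriples = to_ntriples(triples)
print(ntriples)
```

Working on lemmas rather than surface forms keeps the sketch closer to the language-independent design the abstract describes, although a real pipeline would need entity disambiguation and a proper ontology instead of URIs minted directly from lemmas.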

Keywords

data extraction, techniques, transformation, linked data