This paper deals with the harmonisation of existing data resources containing word-formation features by converting them into a common file format and partially aligning their annotation schemas. We summarise the (dis)similarities between the resources and describe the individual steps of the harmonisation procedure, including manual annotation and the application of machine learning techniques.
The resulting 'Universal Derivations 1.0' collection contains 27 harmonised resources covering 20 languages. It is publicly available in the LINDAT/CLARIAH-CZ repository and can be queried via the DeriSearch tool.