Combining association measures for lexical classification using clustering of receiver operating characteristic curves

Publication at the Faculty of Mathematics and Physics | 2013

Abstract

This paper focuses on combining association measures using their corresponding receiver operating characteristic (ROC) curves. The approach is motivated by the problem of automatic bigram collocation extraction in computational linguistics.

It is based on supervised machine learning techniques and the fact that different association measures discover different collocation types. Clusters of equivalent ROC curves are first determined by a testing procedure.

The paper's major contribution is an investigation of the possibility of combining representatives of the clusters of equivalent association measures into more complex models, thus improving performance of the collocation extraction.
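
To make the ideas in the abstract concrete, the following is a minimal sketch and not the paper's procedure. It scores a handful of bigrams with two common association measures (pointwise mutual information and the t-score), builds an ROC curve for each from collocation labels, and compares the area under each curve with that of a naive combination. The bigram counts, the labels, the corpus size `N`, and the `combine` function (a plain mean of standardized scores) are all assumptions made for illustration; the paper itself combines cluster representatives with supervised models and uses a statistical test to group equivalent curves, neither of which is reproduced here.

```python
import math
import statistics


def pmi(f_xy, f_x, f_y, n):
    """Pointwise mutual information of a bigram from raw counts."""
    return math.log2((f_xy * n) / (f_x * f_y))


def t_score(f_xy, f_x, f_y, n):
    """t-score: observed minus expected frequency, scaled by sqrt(observed)."""
    expected = f_x * f_y / n
    return (f_xy - expected) / math.sqrt(f_xy)


def roc_points(scores, labels):
    """ROC curve as (FPR, TPR) points, lowering the threshold step by step.

    Items with equal scores are processed together, so ties produce
    a single diagonal segment instead of an arbitrary ordering.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def zscores(values):
    """Standardize scores so measures on different scales are comparable."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values) or 1.0  # guard against constant scores
    return [(v - mean) / sd for v in values]


def combine(*score_lists):
    """Hypothetical stand-in for a learned model: mean of standardized scores."""
    standardized = [zscores(s) for s in score_lists]
    return [statistics.fmean(col) for col in zip(*standardized)]


# Toy data (invented): (f_xy, f_x, f_y, is_collocation), corpus size N.
N = 10000
BIGRAMS = [
    (30, 40, 50, 1),
    (20, 500, 600, 0),
    (8, 10, 12, 1),
    (15, 900, 300, 0),
    (25, 60, 45, 1),
    (5, 200, 250, 0),
]
labels = [b[3] for b in BIGRAMS]
pmi_scores = [pmi(f_xy, f_x, f_y, N) for f_xy, f_x, f_y, _ in BIGRAMS]
t_scores = [t_score(f_xy, f_x, f_y, N) for f_xy, f_x, f_y, _ in BIGRAMS]

for name, scores in [("PMI", pmi_scores),
                     ("t-score", t_scores),
                     ("combined", combine(pmi_scores, t_scores))]:
    print(name, auc(roc_points(scores, labels)))
```

On real data the measures would be computed over a full corpus and evaluated on held-out annotated bigrams; the toy figures here only exercise the code path.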