Predictive performance comparisons of different feature extraction methods in a financial column corpus

Publication

Abstract

This contribution concerns the processing of a corpus drawn from a weekly financial column. In particular, we focus on extracting document-level indices and textual variables.

Furthermore, we compare several variable extraction methods to evaluate their predictive ability. The results confirm the hypothesis that vectors derived from word embeddings do not improve predictive performance compared with other variable extraction methods, but remain a fundamental resource for understanding the semantics of the texts.

Keywords

NLP, information retrieval, Italian