Designing Similarity Indexes with Parallel Genetic Programming

Publication at the Faculty of Mathematics and Physics | 2013

Abstract

The increasing diversity of unstructured databases drives the development of advanced indexing techniques, because the metric indexing model does not fit general similarity models. When the most critical metric postulate, the triangle inequality, does not hold, the metric model produces notable errors during query evaluation.
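To make the failure mode concrete, the following minimal sketch shows how a pivot-based lower bound, which is only valid under the triangle inequality, can wrongly discard a qualifying object. The squared Euclidean distance is chosen here purely as an illustrative non-metric measure, and all function and variable names are hypothetical; none of them come from the paper or from the SIMDEX Framework.

```python
def sq_euclid(a, b):
    """Squared Euclidean distance: an illustrative measure that is not a metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triangle_holds(d, x, y, z):
    """Check the triangle inequality d(x, z) <= d(x, y) + d(y, z) for one triple."""
    return d(x, z) <= d(x, y) + d(y, z)


def pivot_lower_bound(d, query, obj, pivot):
    """Lower bound on d(query, obj); correct only if d satisfies the triangle inequality."""
    return abs(d(query, pivot) - d(obj, pivot))


# Hypothetical one-dimensional data, chosen so the violation is easy to see.
query, obj, pivot = (0.0,), (1.0,), (2.0,)
radius = 2.0  # range query: return every object with d(query, obj) <= radius

true_distance = sq_euclid(query, obj)                     # (0 - 1)^2
bound = pivot_lower_bound(sq_euclid, query, obj, pivot)   # |(0 - 2)^2 - (1 - 2)^2|

# The object qualifies for the query, yet the pivot filter would prune it.
falsely_dismissed = true_distance <= radius and bound > radius

print("triangle inequality holds:", triangle_holds(sq_euclid, query, obj, pivot))
print("true distance:", true_distance, "lower bound:", bound)
print("falsely dismissed:", falsely_dismissed)
```

In this sketch the supposed lower bound exceeds the true distance, so a metric index that filters by pivots would miss a correct answer. This is the kind of query error the abstract refers to.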

To overcome this situation and obtain higher-quality results, we want to discover better indexing models for databases that use arbitrary similarity measures. However, each database is unique in some way, so we outline an automatic way of searching for the best indexing method.

We introduce an exploration approach that applies parallel genetic programming principles in a multi-threaded environment built upon the recently introduced SIMDEX Framework. Furthermore, we introduce the smart pivot table, an intelligent indexing method capable of incorporating the obtained results.

We supplement the theoretical background with experiments showing the improvements achieved in comparison with single-threaded evaluation.