Permutation Based Indexing for High Dimensional Data on GPU Architectures

Publication at the Faculty of Mathematics and Physics

2015

Abstract

Permutation-based indexing is one of the most popular techniques for approximate nearest-neighbor search in high-dimensional spaces. With the exponential growth of multimedia data, the time required to index the data has become a serious constraint on indexing techniques.

One possible step towards faster index construction is the use of massively parallel platforms such as GPGPU architectures. In this paper, we analyze the computational costs of the individual steps of permutation-based index construction in a high-dimensional feature space and propose a hybrid solution in which the computational power of the GPU is used for distance computations while the host CPU performs the postprocessing and sorting steps.
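The two stages named above can be illustrated with a minimal CPU-only sketch. It is not the implementation evaluated in the paper: the Euclidean metric, the function names, and the optional `prefix_len` truncation are assumptions made purely for illustration. The first function corresponds to the distance computation that the hybrid design assigns to the GPU, the second to the sorting step left on the host CPU.

```python
import math

def distances_to_pivots(obj, pivots):
    """Stage 1 (distance computation): distance from one object to every pivot.

    Every (object, pivot) pair is independent, which is what makes this
    stage data-parallel. Euclidean distance is an assumption here; the
    abstract does not name the metric.
    """
    return [math.dist(obj, p) for p in pivots]

def permutation(dists):
    """Stage 2 (sorting): pivot indices ordered by increasing distance.

    sorted() is stable, so equal distances keep the lower pivot index first.
    """
    return sorted(range(len(dists)), key=dists.__getitem__)

def build_index(objects, pivots, prefix_len=None):
    """Compute one pivot permutation per object.

    prefix_len is a hypothetical parameter that keeps only the closest
    pivots of each permutation; None keeps the full permutation.
    """
    index = []
    for obj in objects:
        perm = permutation(distances_to_pivots(obj, pivots))
        index.append(perm if prefix_len is None else perm[:prefix_len])
    return index

# Toy 2-D data; real feature spaces would be high-dimensional.
pivots = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
objects = [(1.0, 2.0), (9.0, 2.0), (2.0, 8.0)]
index = build_index(objects, pivots)
print(index)  # [[0, 2, 1], [1, 0, 2], [2, 0, 1]]
```

In this sketch each object is represented only by the order in which it "sees" the pivots, so objects with similar permutations are treated as candidates for being close to each other.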

Although computing the distances is a naturally data-parallel task, an efficient implementation is quite challenging because of various GPU limitations and the complex memory hierarchy. We tested possible approaches to work division and data caching to make the best use of the GPU.

We summarize our empirical results and point out the optimal solution.

Keywords

Permutation indexing, High Dimensional Data, GPU, parallel