Nonmetric Similarity Search Problems in Very Large Collections

Publication at Faculty of Mathematics and Physics | 2011

Abstract

Similarity search is a fundamental problem in many disciplines, such as multimedia databases, data mining, bioinformatics, computer vision, and pattern recognition. The standard approach to implementing similarity search is to define a dissimilarity measure that satisfies the properties of a metric (strict positiveness, symmetry, and the triangle inequality) and then use it to query large data collections for similar objects.
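The three metric properties named above can be checked empirically on a finite sample of objects. The following is a minimal sketch, not a method taken from the publication: the function names `euclidean` and `satisfies_metric_properties`, the tolerance `tol`, and the sample points are all assumptions chosen for illustration. Passing the check on a sample does not prove that a measure is a metric; it can only expose violations.

```python
import itertools
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.dist(a, b)


def satisfies_metric_properties(dist, sample, tol=1e-9):
    """Check the metric properties of `dist` on a finite sample.

    Hypothetical helper for illustration only. Returns False as soon as
    one property is violated on the sample, True otherwise.
    """
    for x, y in itertools.product(sample, repeat=2):
        d = dist(x, y)
        # Strict positiveness: d(x, x) = 0 and d(x, y) > 0 for x != y.
        if x == y:
            if abs(d) > tol:
                return False
        elif d <= 0:
            return False
        # Symmetry: d(x, y) = d(y, x).
        if abs(d - dist(y, x)) > tol:
            return False
    for x, y, z in itertools.product(sample, repeat=3):
        # Triangle inequality: d(x, z) <= d(x, y) + d(y, z).
        if dist(x, z) > dist(x, y) + dist(y, z) + tol:
            return False
    return True


# Assumed sample of 2-D points, chosen arbitrarily.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
is_metric_on_sample = satisfies_metric_properties(euclidean, sample)
```

The check is cubic in the sample size because of the triangle inequality loop, so it is only practical for small samples.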

The advantage of this approach is that many index structures (so-called metric access methods) can be used to perform the queries efficiently. However, a recent survey [91] has shown that similarity measures that do not satisfy the metric properties are widely used for content-based retrieval, because these (usually) more complex measures are more effective and give better results.
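To make the notion of a nonmetric measure concrete, the sketch below uses the squared Euclidean distance, a well-known example that is symmetric and strictly positive but breaks the triangle inequality. This measure is an illustration chosen here, not one named in the abstract, and the helper names `squared_euclidean` and `violates_triangle_inequality` are hypothetical.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def violates_triangle_inequality(dist, x, y, z):
    """True if going directly from x to z costs more than going via y."""
    return dist(x, z) > dist(x, y) + dist(y, z)


# Three collinear points on the real line (assumed example data).
x, y, z = (0.0,), (1.0,), (2.0,)

# Direct cost is (2 - 0)^2 = 4, while the detour via y costs 1 + 1 = 2.
direct = squared_euclidean(x, z)
detour = squared_euclidean(x, y) + squared_euclidean(y, z)
violated = violates_triangle_inequality(squared_euclidean, x, y, z)
```

Metric access methods typically rely on the triangle inequality to discard candidates without computing their distance to the query, so a violation like this one means such pruning can wrongly drop relevant objects.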