Abstract
Approximate nearest neighbor search in high-dimensional metric spaces is crucial in modern data science pipelines. Efficient search algorithms often rely on partitioning the metric space. To find the approximate nearest neighbors of a query point, a candidate set is constructed based on the points that belong to the same part in the partition, and the closest points among these candidates are identified via a bruteforce search. Hyvönen et al. (JMLR, 2024) argue that viewing this problem as a multi-class labeling problem suggests that this traditional method is not optimal. Instead, they propose a “natural classifier” search strategy that incorporates the true labels of the candidate points, demonstrating faster searches and smaller candidate sets for the same accuracy for tree-based space partitioning methods. This paper explores the natural classifier and other search strategies for partitioning based on locality-sensitive hashing. We propose a new strategy that offers more precise control over the balance between performance and quality. Our analysis highlights the trade-offs between these methods, providing insights into optimizing search efficiency in various contexts
| Original language | English |
|---|---|
| Title of host publication | Similarity Search and Applications, SISAP 2024 |
| Publication date | 2024 |
| DOIs | |
| Publication status | Published - 2024 |
Keywords
- Approximate Nearest Neighbor Search
- Locality-Sensitive Hashing
- Similarity Search
- Search Strategies
Fingerprint
Dive into the research topics of 'An Empirical Evaluation of Search Strategies for Locality-Sensitive Hashing: Lookup, Voting, and Natural Classifier Search'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver