Scalability and Total Recall with Fast CoveringLSH

Ninh Dang Pham, Rasmus Pagh

Research output: Conference Article in Proceeding or Book/Report chapter › Article in proceedings › Research › peer-review

Abstract

Locality-sensitive hashing (LSH) has emerged as the dominant algorithmic technique for similarity search with strong performance guarantees in high-dimensional spaces. A drawback of traditional LSH schemes is that they may have false negatives, i.e., the recall is less than 100%. This limits the applicability of LSH in settings requiring precise performance guarantees. Building on the recent theoretical "CoveringLSH" construction that eliminates false negatives, we propose a fast and practical covering LSH scheme for Hamming space called Fast CoveringLSH (fcLSH). Inheriting the design benefits of CoveringLSH, our method avoids false negatives and always reports all near neighbors. Compared to CoveringLSH, we achieve an asymptotic improvement in the hash function computation time from O(dL) to O(d + L log L), where d is the dimensionality of the data and L is the number of hash tables. Our experiments on synthetic and real-world data sets demonstrate that fcLSH is comparable (and often superior) to traditional hashing-based approaches for search radius up to 20 in high-dimensional Hamming space.
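
To make the false-negative issue concrete, here is a minimal sketch of classical bit-sampling LSH in Hamming space. It is not the CoveringLSH or fcLSH construction from the paper; the function names, the toy dimension, and the hand-picked bit positions are assumptions chosen only for illustration. Two points collide in a table when they agree on every sampled bit, so a near neighbor is missed whenever each table happens to sample a bit where the two points differ.

```python
import random


def sample_tables(d, k, L, seed=0):
    """Pick k distinct random bit positions (out of d) for each of L tables."""
    rng = random.Random(seed)
    return [tuple(sorted(rng.sample(range(d), k))) for _ in range(L)]


def hash_point(x, positions):
    """Hash a bit vector by projecting it onto the sampled positions."""
    return tuple(x[i] for i in positions)


def collides(x, y, tables):
    """True if x and y fall in the same bucket in at least one table."""
    return any(hash_point(x, p) == hash_point(y, p) for p in tables)


def hamming(x, y):
    """Number of positions where the two bit vectors differ."""
    return sum(a != b for a, b in zip(x, y))


# Toy example (hypothetical data): y is at Hamming distance 1 from x.
x = [0] * 8
y = list(x)
y[3] = 1

# Every table samples bit 3, so the near neighbor is never reported.
unlucky_tables = [(0, 3), (3, 5)]
# One table avoids bit 3, so the pair collides there.
lucky_tables = [(0, 3), (1, 2)]

missed = not collides(x, y, unlucky_tables)
found = collides(x, y, lucky_tables)
```

With randomly sampled positions (for example via `sample_tables`), adding tables makes a miss less likely but does not rule it out; that residual failure probability is the gap that covering schemes are designed to close.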
Original language: English
Title of host publication: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management : CIKM '16
Publisher: Association for Computing Machinery
Publication date: 2016
Pages: 1109-1118
ISBN (Electronic): 978-1-4503-4073-1
Publication status: Published - 2016

Keywords

  • Locality-sensitive hashing
  • High-dimensional similarity search
  • Hamming space
  • False negatives
  • Hash function computation

Projects
  • SSS: Scalable Similarity Search

    Pagh, R. (PI), Christiani, T. L. (CoI), Pham, N. D. (CoI), Faithfull, A. (CoI), Silvestri, F. (CoI), Mikkelsen, J. W. (CoI), Sivertsen, J. V. T. (CoI), Aumüller, M. (CoI), Skala, M. (CoI), Ceccarello, M. (CoI), Themsen, R. (CoI), Jacob, R. (CoI), McCauley, S. (CoI) & Ahle, T. D. (CoI)

    European Commission

    01/05/2014 to 30/04/2019

    Project: Research
