Range-efficient consistent sampling and locality-sensitive hashing for polygons

Joachim Gudmundsson, Rasmus Pagh

Research output: Conference Article in Proceeding or Book/Report chapterArticle in proceedingsResearchpeer-review

Abstract

Locality-sensitive hashing (LSH) is a fundamental technique for similarity search and similarity
estimation in high-dimensional spaces. The basic idea is that similar objects should produce hash
collisions with probability significantly larger than objects with low similarity. We consider LSH
for objects that can be represented as point sets in either one or two dimensions. To make the
point sets finite size we consider the subset of points on a grid. Directly applying LSH (e.g. minwise hashing) to these point sets would require time proportional to the number of points. We
seek to achieve time that is much lower than direct approaches.

Technically, we introduce new primitives for range-efficient consistent sampling (of independent
interest), and show how to turn such samples into LSH values. Another application of our
technique is a data structure for quickly estimating the size of the intersection or union of a set
of preprocessed polygons. Curiously, our consistent sampling method uses transformation to a
geometric problem.
Original languageEnglish
Title of host publicationProceedings of 28th International Symposium on Algorithms and Computation (ISAAC 2017)
Number of pages15
PublisherSchloss Dagstuhl - Leibniz-Zentrum fuer Informatik GmbH
Publication date2017
DOIs
Publication statusPublished - 2017
SeriesLeibniz International Proceedings in Informatics
ISSN1868-8969

Keywords

  • Locality-sensitive hashing
  • Similarity search
  • High-dimensional spaces
  • Consistent sampling
  • Geometric transformation

Fingerprint

Dive into the research topics of 'Range-efficient consistent sampling and locality-sensitive hashing for polygons'. Together they form a unique fingerprint.
  • SSS: Scalable Similarity Search

    Pagh, R. (PI), Christiani, T. L. (CoI), Pham, N. D. (CoI), Faithfull, A. (CoI), Silvestri, F. (CoI), Mikkelsen, J. W. (CoI), Sivertsen, J. V. T. (CoI), Aumüller, M. (CoI), Skala, M. (CoI), Ceccarello, M. (CoI), Themsen, R. (CoI), Jacob, R. (CoI), McCauley, S. (CoI) & Ahle, T. D. (CoI)

    European Commission

    01/05/201430/04/2019

    Project: Research

Cite this