## Abstract

Locality-sensitive hashing (LSH) is a fundamental technique for similarity search and similarity

estimation in high-dimensional spaces. The basic idea is that similar objects should produce hash

collisions with probability significantly larger than objects with low similarity. We consider LSH

for objects that can be represented as point sets in either one or two dimensions. To make the

point sets finite size we consider the subset of points on a grid. Directly applying LSH (e.g. minwise hashing) to these point sets would require time proportional to the number of points. We

seek to achieve time that is much lower than direct approaches.

Technically, we introduce new primitives for range-efficient consistent sampling (of independent

interest), and show how to turn such samples into LSH values. Another application of our

technique is a data structure for quickly estimating the size of the intersection or union of a set

of preprocessed polygons. Curiously, our consistent sampling method uses transformation to a

geometric problem.

estimation in high-dimensional spaces. The basic idea is that similar objects should produce hash

collisions with probability significantly larger than objects with low similarity. We consider LSH

for objects that can be represented as point sets in either one or two dimensions. To make the

point sets finite size we consider the subset of points on a grid. Directly applying LSH (e.g. minwise hashing) to these point sets would require time proportional to the number of points. We

seek to achieve time that is much lower than direct approaches.

Technically, we introduce new primitives for range-efficient consistent sampling (of independent

interest), and show how to turn such samples into LSH values. Another application of our

technique is a data structure for quickly estimating the size of the intersection or union of a set

of preprocessed polygons. Curiously, our consistent sampling method uses transformation to a

geometric problem.

Original language | English |
---|---|

Title of host publication | Proceedings of 28th International Symposium on Algorithms and Computation (ISAAC 2017) |

Number of pages | 15 |

Publisher | Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik GmbH |

Publication date | 2017 |

DOIs | |

Publication status | Published - 2017 |

Series | Leibniz International Proceedings in Informatics |
---|---|

ISSN | 1868-8969 |