Projects per year
Abstract
Localitysensitive hashing (LSH) is a fundamental technique for similarity search and similarity
estimation in highdimensional spaces. The basic idea is that similar objects should produce hash
collisions with probability significantly larger than objects with low similarity. We consider LSH
for objects that can be represented as point sets in either one or two dimensions. To make the
point sets finite size we consider the subset of points on a grid. Directly applying LSH (e.g. minwise hashing) to these point sets would require time proportional to the number of points. We
seek to achieve time that is much lower than direct approaches.
Technically, we introduce new primitives for rangeefficient consistent sampling (of independent
interest), and show how to turn such samples into LSH values. Another application of our
technique is a data structure for quickly estimating the size of the intersection or union of a set
of preprocessed polygons. Curiously, our consistent sampling method uses transformation to a
geometric problem.
estimation in highdimensional spaces. The basic idea is that similar objects should produce hash
collisions with probability significantly larger than objects with low similarity. We consider LSH
for objects that can be represented as point sets in either one or two dimensions. To make the
point sets finite size we consider the subset of points on a grid. Directly applying LSH (e.g. minwise hashing) to these point sets would require time proportional to the number of points. We
seek to achieve time that is much lower than direct approaches.
Technically, we introduce new primitives for rangeefficient consistent sampling (of independent
interest), and show how to turn such samples into LSH values. Another application of our
technique is a data structure for quickly estimating the size of the intersection or union of a set
of preprocessed polygons. Curiously, our consistent sampling method uses transformation to a
geometric problem.
Original language  English 

Title of host publication  Proceedings of 28th International Symposium on Algorithms and Computation (ISAAC 2017) 
Number of pages  15 
Publisher  Schloss Dagstuhl  LeibnizZentrum fuer Informatik GmbH 
Publication date  2017 
DOIs  
Publication status  Published  2017 
Series  Leibniz International Proceedings in Informatics 

ISSN  18688969 
Keywords
 Localitysensitive hashing
 Similarity search
 Highdimensional spaces
 Consistent sampling
 Geometric transformation
Fingerprint
Dive into the research topics of 'Rangeefficient consistent sampling and localitysensitive hashing for polygons'. Together they form a unique fingerprint.Projects
 1 Finished

SSS: Scalable Similarity Search
Pagh, R. (PI), Christiani, T. L. (CoI), Pham, N. D. (CoI), Faithfull, A. (CoI), Silvestri, F. (CoI), Mikkelsen, J. W. (CoI), Sivertsen, J. V. T. (CoI), Aumüller, M. (CoI), Skala, M. (CoI), Ceccarello, M. (CoI), Themsen, R. (CoI), Jacob, R. (CoI), McCauley, S. (CoI) & Ahle, T. D. (CoI)
01/05/2014 → 30/04/2019
Project: Research