I/O-efficient Similarity Join

Rasmus Pagh, Ninh Dang Pham, Francesco Silvestri, Morten Danmark Stöckel

Publikation: Artikel i tidsskrift og konference artikel i tidsskriftTidsskriftartikelForskningpeer review

Abstract

We present an I/O-efficient algorithm for computing similarity joins based on locality-sensitive hashing (LSH). In contrast to the filtering methods commonly suggested our method has provable subquadratic dependency on the data size. Further, in contrast to straightforward implementations of known LSH-based algorithms on external memory, our approach is able to take significant advantage of the available internal memory:
OriginalsprogEngelsk
TidsskriftAlgorithmica
Vol/bind78
ISSN0178-4617
DOI
StatusUdgivet - 2017

Emneord

  • cache oblivious
  • cache aware
  • locality sensitive hashing
  • Similarity join

Fingeraftryk

Dyk ned i forskningsemnerne om 'I/O-efficient Similarity Join'. Sammen danner de et unikt fingeraftryk.
  • SSS: Scalable Similarity Search

    Pagh, R. (PI), Christiani, T. L. (CoI), Pham, N. D. (CoI), Faithfull, A. (CoI), Silvestri, F. (CoI), Mikkelsen, J. W. (CoI), Sivertsen, J. V. T. (CoI), Aumüller, M. (CoI), Skala, M. (CoI), Ceccarello, M. (CoI), Themsen, R. (CoI), Jacob, R. (CoI), McCauley, S. (CoI) & Ahle, T. D. (CoI)

    European Commission

    01/05/201430/04/2019

    Projekter: ProjektForskning

Citationsformater