Large-Scale Similarity Joins With Guarantees

Rasmus Pagh

Publication: Conference article in proceedings or book/report chapter; Conference contribution in proceedings; Research; Peer-reviewed

Abstract

The ability to handle noisy or imprecise data is becoming increasingly important in computing. In the database community the notion of similarity join has been studied extensively, yet existing solutions have offered weak performance guarantees. Either they are based on deterministic filtering techniques that often, but not always, succeed in reducing computational costs, or they are based on randomized techniques that have improved guarantees on computational cost but come with a probability of not returning the correct result. The aim of this paper is to give an overview of randomized techniques for high-dimensional similarity search, and discuss recent advances towards making these techniques more widely applicable by eliminating probability of error and improving the locality of data access.
Original language: English
Title: 18th International Conference on Database Theory (ICDT 2015)
Volume: 31
Publication date: 2015
Pages: 15-24
ISBN (electronic): 978-3-939897-79-8
Status: Published - 2015
Series name: Leibniz International Proceedings in Informatics
ISSN: 1868-8969

Keywords

  • Similarity Join
  • Randomized Techniques
  • High-Dimensional Data
  • Computational Guarantees
  • Locality of Data Access

Projects
  • SSS: Scalable Similarity Search

    Pagh, R. (PI), Christiani, T. L. (CoI), Pham, N. D. (CoI), Faithfull, A. (CoI), Silvestri, F. (CoI), Mikkelsen, J. W. (CoI), Sivertsen, J. V. T. (CoI), Aumüller, M. (CoI), Skala, M. (CoI), Ceccarello, M. (CoI), Themsen, R. (CoI), Jacob, R. (CoI), McCauley, S. (CoI) & Ahle, T. D. (CoI)

    European Commission

    01/05/2014 to 30/04/2019

    Projects: Project; Research
