Measuring the Difficulty of Distance-Based Indexing

Matthew Skala

Publication: Conference article in proceedings or book/report chapter; conference contribution in proceedings; research; peer-reviewed

Abstract

Data structures for similarity search are commonly evaluated on data in vector spaces, but distance-based data structures are also applicable to non-vector spaces with no natural concept of dimensionality. The intrinsic dimensionality statistic of Chávez and Navarro provides a way to compare the performance of similarity indexing and search algorithms across different spaces, and predict the performance of index data structures on non-vector spaces by relating them to equivalent vector spaces. We characterise its asymptotic behaviour, and give experimental results to calibrate these comparisons.
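
As a rough illustration of the statistic the abstract refers to: the intrinsic dimensionality of Chávez and Navarro is commonly defined as ρ = μ²/(2σ²), where μ and σ² are the mean and variance of the distance between two randomly chosen points. The sketch below estimates ρ by sampling pairs and needs only a distance function, so it applies equally to non-vector spaces. The function names, the pair-sampling scheme, and the uniform-cube test data are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random
import statistics


def intrinsic_dimensionality(points, distance, num_pairs=2000, seed=0):
    """Estimate rho = mean^2 / (2 * variance) of pairwise distances.

    Hypothetical helper: samples `num_pairs` pairs of distinct points
    and uses only the supplied `distance` function, never coordinates.
    """
    rng = random.Random(seed)
    dists = []
    for _ in range(num_pairs):
        x, y = rng.sample(points, 2)  # two distinct elements of the data set
        dists.append(distance(x, y))
    mean = statistics.fmean(dists)
    var = statistics.pvariance(dists)
    if var == 0:
        raise ValueError("all sampled distances are equal; rho is undefined")
    return mean * mean / (2.0 * var)


def uniform_cube(n, dim, seed=0):
    """Illustrative data only: n points drawn uniformly from [0, 1]^dim."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(dim)) for _ in range(n)]


if __name__ == "__main__":
    for dim in (2, 4, 8, 16):
        pts = uniform_cube(500, dim, seed=dim)
        rho = intrinsic_dimensionality(pts, math.dist)
        print(f"dim={dim:2d}  rho={rho:.2f}")
```

Because `distance` is passed in as a parameter, the same estimator could be run on strings under edit distance or on any other metric space, which is the kind of cross-space comparison the abstract describes.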
Original language: English
Title: Proceedings of the 12th International Conference on String Processing and Information Retrieval (SPIRE 2005), Buenos Aires, Argentina, November 2–4, 2005
Number of pages: 12
Volume: 3772
Publisher: Springer London
Publication date: 2005
Pages: 103-114
Status: Published - 2005
Published externally: Yes

Keywords

  • Similarity search
  • Vector spaces
  • Distance-based data structures
  • Intrinsic dimensionality
  • Performance prediction
