Measuring the Difficulty of Distance-Based Indexing

Matthew Skala

Publication: Conference article in proceedings or book/report chapter; conference contribution in proceedings; research; peer-reviewed


Data structures for similarity search are commonly evaluated on data in vector spaces, but distance-based data structures are also applicable to non-vector spaces with no natural concept of dimensionality. The intrinsic dimensionality statistic of Chávez and Navarro provides a way to compare the performance of similarity indexing and search algorithms across different spaces, and predict the performance of index data structures on non-vector spaces by relating them to equivalent vector spaces. We characterise its asymptotic behaviour, and give experimental results to calibrate these comparisons.
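
As a rough illustration of the statistic the abstract refers to: Chávez and Navarro define intrinsic dimensionality as ρ = μ²/(2σ²), where μ and σ² are the mean and variance of the distance between two random points of the space. The sketch below estimates ρ from a finite sample by drawing random pairs. The function name `intrinsic_dimensionality`, its parameters, and the pair-sampling scheme are assumptions made for illustration, not the procedure used in the paper.

```python
import math
import random
import statistics


def intrinsic_dimensionality(points, distance, pairs=10000, seed=0):
    """Estimate rho = mu^2 / (2 * sigma^2) from sampled pairwise distances.

    `points` is any sequence of objects and `distance` any metric on them,
    so the estimate applies to non-vector spaces as well.
    Hypothetical helper: sampling distinct pairs from a finite sample only
    approximates the distance distribution of the underlying space.
    """
    rng = random.Random(seed)
    dists = []
    for _ in range(pairs):
        x, y = rng.sample(points, 2)  # two distinct sample points
        dists.append(distance(x, y))
    mu = statistics.fmean(dists)
    var = statistics.pvariance(dists, mu)
    if var == 0:
        raise ValueError("all sampled distances are equal; rho is undefined")
    return mu * mu / (2.0 * var)


def uniform_cube(n, dim, seed=0):
    """Illustrative data: n points drawn uniformly from the unit cube."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(dim)) for _ in range(n)]


if __name__ == "__main__":
    for dim in (2, 8, 32):
        rho = intrinsic_dimensionality(uniform_cube(500, dim), math.dist)
        print(dim, round(rho, 2))
```

Because only a distance function is required, the same call should work for strings under edit distance or any other metric space. The resulting ρ could then be compared against values measured on vector spaces of known dimension, which is the kind of calibration the abstract describes.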
Title: Proceedings of the 12th International Conference on String Processing and Information Retrieval (SPIRE 2005), Buenos Aires, Argentina, November 2–4, 2005
Number of pages: 12
Publisher: Springer London
Status: Published - 2005
Published externally: Yes

