Abstract
Data structures for similarity search are commonly evaluated on data in vector spaces, but distance-based data structures are also applicable to non-vector spaces with no natural concept of dimensionality. The intrinsic dimensionality statistic of Chávez and Navarro provides a way to compare the performance of similarity indexing and search algorithms across different spaces, and predict the performance of index data structures on non-vector spaces by relating them to equivalent vector spaces. We characterise its asymptotic behaviour, and give experimental results to calibrate these comparisons.
| Originalsprog | Engelsk |
|---|---|
| Titel | Proceedings of the 12th International Conference on String Processing and Information Retrieval (SPIRE 2005), Buenos Aires, Argentina, November 2--4, 2005 |
| Antal sider | 12 |
| Vol/bind | 3772 |
| Forlag | Springer |
| Publikationsdato | 2005 |
| Sider | 103-114 |
| Status | Udgivet - 2005 |
| Udgivet eksternt | Ja |
Emneord
- Similarity search
- Vector spaces
- Distance-based data structures
- Intrinsic dimensionality
- Performance prediction