Abstract
While approximate nearest-neighbor (ANN) search is an integral part of the modern multimedia analytics pipeline, the ever-hungry AI models may frequently starve the ANN structures of resources, particularly memory. We present a novel white-box implementation of the disk-based hierarchical eCP index, called eCP-FS, that extends the disk-based strategy beyond merely storing data on disk, and instead implements the entire structure as an overlay file system using Zarr. This maps the (normally complex) index structure intuitively to a familiar hierarchical folder structure, which in turn makes the index much easier to visualize and analyse than the typical in-memory black-box structures of other algorithms. We furthermore implement incremental retrieval over eCP-FS, which benefits even more from file-system caching. Using an experimental benchmark inspired by live retrieval competitions, we show that despite trading raw speed for a reduced memory footprint, eCP-FS is still a competitive option in the modern-day analytics pipeline.
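To make the folder-structure idea concrete, the following is a minimal sketch of a hierarchical cluster index laid out as directories. It is not the eCP-FS implementation: plain folders and JSON files stand in for Zarr groups and arrays, and every name in it (`write_node`, `search`, `centroid.json`, `vectors.json`, the parameters `k` and `b`) is a hypothetical choice made for illustration only.

```python
import json
import math
import os
import tempfile


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def read_json(path):
    with open(path) as f:
        return json.load(f)


def write_node(path, centroid, children=None, vectors=None):
    """Store one index node as a folder holding its centroid and either
    child folders (internal node) or a vectors file (leaf cluster)."""
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "centroid.json"), "w") as f:
        json.dump(centroid, f)
    if vectors is not None:
        with open(os.path.join(path, "vectors.json"), "w") as f:
            json.dump(vectors, f)  # list of [id, vector] pairs
    for name, child in (children or {}).items():
        write_node(os.path.join(path, name), **child)


def search(root, query, k=1, b=1):
    """Descend level by level, following the b closest child folders of
    each visited node, then rank the vectors found in the reached leaves."""
    frontier, candidates = [root], []
    while frontier:
        next_frontier = []
        for node in frontier:
            leaf = os.path.join(node, "vectors.json")
            if os.path.exists(leaf):
                candidates.extend(read_json(leaf))
                continue
            children = [os.path.join(node, d) for d in sorted(os.listdir(node))
                        if os.path.isdir(os.path.join(node, d))]
            children.sort(key=lambda c: dist(
                query, read_json(os.path.join(c, "centroid.json"))))
            next_frontier.extend(children[:b])
        frontier = next_frontier
    candidates.sort(key=lambda item: dist(query, item[1]))
    return [item[0] for item in candidates[:k]]


def build_demo(root):
    """Write a toy two-cluster index (made-up data) under root."""
    write_node(root, centroid=[0.0, 0.0], children={
        "c0": {"centroid": [0.0, 0.0],
               "vectors": [["a", [0.1, 0.0]], ["b", [0.0, 0.3]]]},
        "c1": {"centroid": [5.0, 5.0],
               "vectors": [["c", [5.1, 5.0]], ["d", [4.0, 4.5]]]},
    })


if __name__ == "__main__":
    demo_root = tempfile.mkdtemp()
    build_demo(demo_root)
    print(search(demo_root, [4.9, 5.2], k=1, b=1))
```

Because every node is an ordinary folder, the toy index can be inspected with standard file-system tools, and repeated reads of the same centroid files can be served from the operating system's file cache, which is the kind of effect the abstract alludes to.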
| Original language | English |
|---|---|
| Title of host publication | Similarity Search and Applications |
| Editors | Giuseppe Amato, Vladimir Mic, Agma Traina, Nicola Messina, Laurent Amsaleg, Gylfi Þór Guðmundsson, Björn Þór Jónsson, Lucia Vadicamo |
| Number of pages | 9 |
| Place of Publication | Cham |
| Publisher | Springer Nature Switzerland |
| Publication date | 8 Oct 2025 |
| Pages | 303-311 |
| ISBN (Print) | 978-3-032-06068-6 |
| ISBN (Electronic) | 978-3-032-06069-3 |
| Publication status | Published - 8 Oct 2025 |
| Externally published | Yes |
| Event | Similarity Search and Applications - Bologna, Italy; Duration: 5 Oct 2022 → 7 Oct 2022; Conference number: 15th |
Conference
| Conference | Similarity Search and Applications |
|---|---|
| Number | 15th |
| Country/Territory | Italy |
| City | Bologna |
| Period | 05/10/2022 → 07/10/2022 |
Series
| Series | LNCS |
|---|---|
| Volume | 16134 |
Keywords
- High-dimensional indexing
- Resource Constrained Search
- Incremental Retrieval
- Disk-based ANN
Fingerprint
Dive into the research topics of 'The Curious Case of High-Dimensional Indexing as a File Structure: A Case Study of eCP-FS'. Together they form a unique fingerprint.