Similarity Search: Algorithms for Sets and other High Dimensional Data

Thomas Dybdahl Ahle

Research output: Book / Anthology / Report / Ph.D. thesisPh.D. thesis


We study five fundemental problems in the field of high dimensional algorithms,
similarity search and machine learning. We obtain many new results,
• Data structures with Las Vegas guarantees for `1-Approximate Near Neighbours
and Braun Blanquet Approximate Set Similarity Search with performance
matching state of the art Monte Carlo algorithms up to factors no(1).
• “Supermajorities:” The first space/time tradeoff for Approximate Set Similarity
Search, along with a number of lower bounds suggesting that the
algorithms are optimal among all Locality Sensitive Hashing (LSH) data
• The first lower bounds on approximative data structures based on the Orthogonal
Vectors Conjecture (OVC).
• Output Sensitive LSH data structures with query time near t + nr for outputting
t distinct elements, where nr is the state of the art time to output a
single element and tnr was the previous best for t items.
• A Johnson Lindenstrauss embedding with fast multiplication on tensors (a
Tensor Sketch), with high probability guarantees for subspace- and nearneighbour-embeddings.
Original languageEnglish
PublisherIT University of Copenhagen
Number of pages158
Publication statusPublished - 2019


Dive into the research topics of 'Similarity Search: Algorithms for Sets and other High Dimensional Data'. Together they form a unique fingerprint.

Cite this