## Abstract

We study the problem of approximate near neighbor (ANN) search

and show the following results:

• An improved framework for solving the ANN problem using

locality-sensitive hashing, reducing the number of evaluations

of locality-sensitive hash functions and the word-RAM complexity compared to the standard framework.

• A framework for solving the ANN problem with space-time

tradeoffs as well as tight upper and lower bounds for the spacetime tradeoff of framework solutions to the ANN problem

under cosine similarity.

• A novel approach to solving the ANN problem on sets along

with a matching lower bound, improving the state of the

art. A self-tuning version of the algorithm is shown through

experiments to outperform existing similarity join algorithms.

• Tight lower bounds for asymmetric locality-sensitive hashing

which has applications to the approximate furthest neighbor

problem, orthogonal vector search, and annulus queries.

• A proof of the optimality of a well-known Boolean localitysensitive hashing scheme.

We study the problem of efficient algorithms for producing highquality pseudorandom numbers and obtain the following results:

• A deterministic algorithm for generating pseudorandom numbers of arbitrarily high quality in constant time using nearoptimal space.

• A randomized construction of a family of hash functions that

outputs pseudorandom numbers of arbitrarily high quality

with space usage and running time nearly matching known

cell-probe lower bounds.

and show the following results:

• An improved framework for solving the ANN problem using

locality-sensitive hashing, reducing the number of evaluations

of locality-sensitive hash functions and the word-RAM complexity compared to the standard framework.

• A framework for solving the ANN problem with space-time

tradeoffs as well as tight upper and lower bounds for the spacetime tradeoff of framework solutions to the ANN problem

under cosine similarity.

• A novel approach to solving the ANN problem on sets along

with a matching lower bound, improving the state of the

art. A self-tuning version of the algorithm is shown through

experiments to outperform existing similarity join algorithms.

• Tight lower bounds for asymmetric locality-sensitive hashing

which has applications to the approximate furthest neighbor

problem, orthogonal vector search, and annulus queries.

• A proof of the optimality of a well-known Boolean localitysensitive hashing scheme.

We study the problem of efficient algorithms for producing highquality pseudorandom numbers and obtain the following results:

• A deterministic algorithm for generating pseudorandom numbers of arbitrarily high quality in constant time using nearoptimal space.

• A randomized construction of a family of hash functions that

outputs pseudorandom numbers of arbitrarily high quality

with space usage and running time nearly matching known

cell-probe lower bounds.

Original language | English |
---|

Publisher | IT-Universitetet i København |
---|---|

Number of pages | 243 |

ISBN (Print) | 978-87-7949012-3 |

Publication status | Published - 2018 |

Series | ITU-DS |
---|---|

Number | 145 |

ISSN | 1602-3536 |