## Abstract

We study the problem of approximate near neighbor (ANN) search and show the following results:

● An improved framework for solving the ANN problem using locality-sensitive hashing, reducing the number of evaluations of locality-sensitive hash functions and the word-RAM complexity compared to the standard framework.

● A framework for solving the ANN problem with space-time tradeoffs as well as tight upper and lower bounds for the space-time tradeoff of framework solutions to the ANN problem under cosine similarity.

● A novel approach to solving the ANN problem on sets along with a matching lower bound, improving the state of the art. A self-tuning version of the algorithm is shown through experiments to outperform existing similarity join algorithms.

● Tight lower bounds for asymmetric locality-sensitive hashing which has applications to the approximate furthest neighbor problem, orthogonal vector search, and annulus queries.

● A proof of the optimality of a well-known Boolean locality-sensitive hashing scheme.

We study the problem of efficient algorithms for producing high-quality pseudorandom numbers and obtain the following results:

● A deterministic algorithm for generating pseudorandom numbers of arbitrarily high quality in constant time using near-optimal space.

● A randomized construction of a family of hash functions that outputs pseudorandom numbers of arbitrarily high quality with space usage and running time nearly matching known cell-probe lower bounds.

● An improved framework for solving the ANN problem using locality-sensitive hashing, reducing the number of evaluations of locality-sensitive hash functions and the word-RAM complexity compared to the standard framework.

● A framework for solving the ANN problem with space-time tradeoffs as well as tight upper and lower bounds for the space-time tradeoff of framework solutions to the ANN problem under cosine similarity.

● A novel approach to solving the ANN problem on sets along with a matching lower bound, improving the state of the art. A self-tuning version of the algorithm is shown through experiments to outperform existing similarity join algorithms.

● Tight lower bounds for asymmetric locality-sensitive hashing which has applications to the approximate furthest neighbor problem, orthogonal vector search, and annulus queries.

● A proof of the optimality of a well-known Boolean locality-sensitive hashing scheme.

We study the problem of efficient algorithms for producing high-quality pseudorandom numbers and obtain the following results:

● A deterministic algorithm for generating pseudorandom numbers of arbitrarily high quality in constant time using near-optimal space.

● A randomized construction of a family of hash functions that outputs pseudorandom numbers of arbitrarily high quality with space usage and running time nearly matching known cell-probe lower bounds.

Originalsprog | Engelsk |
---|

Forlag | IT-Universitetet i København |
---|---|

Antal sider | 243 |

ISBN (Trykt) | 978-87-7949012-3 |

Status | Udgivet - 2018 |

Navn | ITU-DS |
---|---|

Nummer | 145 |

ISSN | 1602-3536 |