Projekter pr. år
Abstract
Consider collections $\mathcal{A}$ and $\mathcal{B}$ of red and blue sets,respectively. Bichromatic Closest Pair is the problem of finding apair from $\mathcal{A}\times \mathcal{B}$ that has similarity higher than a given thresholdaccording to some similarity measure. Our focus here is the classic Jaccard similarity$|\textbf{a}\cap \textbf{b}|/|\textbf{a}\cup \textbf{b}|$ for $(\textbf{a},\textbf{b})\in \mathcal{A}\times \mathcal{B}$.
We consider the approximate version of the problem where we are given thresholds $j_1>j_2$ and wish to returna pair from $\mathcal{A}\times \mathcal{B}$ that has Jaccard similarity higher than$j_2$ if there exists a pair in $\mathcal{A}\times \mathcal{B}$ with Jaccard similarity atleast $j_1$. The classic locality sensitive hashing (LSH) algorithm of Indyk and Motwani (STOC '98), instantiated with the MinHash LSH function of Broder et al., solves this problem in $\tilde O(n^{2-\delta})$ time if $j_1\ge j_2^{1-\delta}$. In particular, for $\delta=\Omega(1)$, the approximation ratio $j_1/j_2=1/j_2^{\delta}$ increases polynomially in $1/j_2$.
In this paper we give a corresponding hardness result. Assuming theOrthogonal Vectors Conjecture (OVC), we show that there cannot be ageneral solution that solves the Bichromatic Closest Pair problem in $O(n^{2-\Omega(1)})$ timefor $j_1/j_2=1/j_2^{o(1)}$. Specifically, assuming OVC, we prove that forany $\delta>0$ there exists an $\varepsilon>0$ such that BichromaticClosest Pair with Jaccard similarity requires time$\Omega(n^{2-\delta})$ for any choice of thresholds$j_2<j_1<1-\delta$, that satisfy $j_1\le j_2^{1-\varepsilon}$.
We consider the approximate version of the problem where we are given thresholds $j_1>j_2$ and wish to returna pair from $\mathcal{A}\times \mathcal{B}$ that has Jaccard similarity higher than$j_2$ if there exists a pair in $\mathcal{A}\times \mathcal{B}$ with Jaccard similarity atleast $j_1$. The classic locality sensitive hashing (LSH) algorithm of Indyk and Motwani (STOC '98), instantiated with the MinHash LSH function of Broder et al., solves this problem in $\tilde O(n^{2-\delta})$ time if $j_1\ge j_2^{1-\delta}$. In particular, for $\delta=\Omega(1)$, the approximation ratio $j_1/j_2=1/j_2^{\delta}$ increases polynomially in $1/j_2$.
In this paper we give a corresponding hardness result. Assuming theOrthogonal Vectors Conjecture (OVC), we show that there cannot be ageneral solution that solves the Bichromatic Closest Pair problem in $O(n^{2-\Omega(1)})$ timefor $j_1/j_2=1/j_2^{o(1)}$. Specifically, assuming OVC, we prove that forany $\delta>0$ there exists an $\varepsilon>0$ such that BichromaticClosest Pair with Jaccard similarity requires time$\Omega(n^{2-\delta})$ for any choice of thresholds$j_2<j_1<1-\delta$, that satisfy $j_1\le j_2^{1-\varepsilon}$.
Originalsprog | Engelsk |
---|---|
Titel | LIPIcs - Leibniz International Proceedings in Informatics - 27th Annual European Symposium on Algorithms (ESA 2019) |
Antal sider | 13 |
Vol/bind | 144 |
Forlag | Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik GmbH |
Publikationsdato | 2019 |
Sider | 80:1–80:13 |
Artikelnummer | 80 |
ISBN (Elektronisk) | 978-3-95977-124-5 |
DOI | |
Status | Udgivet - 2019 |
Navn | Leibniz International Proceedings in Informatics |
---|---|
ISSN | 1868-8969 |
Emneord
- Bichromatic Closest Pair
- Jaccard Similarity
- Approximation Algorithms
- Locality Sensitive Hashing (LSH)
- Orthogonal Vectors Conjecture (OVC)
Fingeraftryk
Dyk ned i forskningsemnerne om 'Hardness of Bichromatic Closest Pair with Jaccard Similarity'. Sammen danner de et unikt fingeraftryk.Projekter
- 1 Afsluttet
-
BARC: Basic Algorithms Research Copenhagen
Pagh, R. (PI), Husfeldt, T. (CoI), Björklund, A. (CoI), Lebeda, C. J. (CoI), Nielsen, N. M. S. (CoI), Bercea, I. O. (CoI), Karppa, M. (CoI), Curticapean, R.-C. (CoI), Limaye, N. (CoI), Aumüller, M. (CoI), Sivertsen, J. V. T. (CoI), McCauley, S. (CoI) & Dell, H. (CoI)
01/09/2017 → 31/08/2024
Projekter: Projekt › Forskning