## Abstract

This work shows how to privately and more accurately estimate Euclidean distance between pairs of vectors. Input vectors $x$ and $y$ are mapped to differentially private sketches $x'$ and $y'$, from which one can estimate the distance between $x$ and $y$. Our estimator relies on the Sparser Johnson-Lindenstrauss constructions by Kane \& Nelson (Journal of the ACM 2014), which for any $0<\alpha,\beta<1/2$ have optimal output dimension $k=\Theta(\alpha^{-2}\log(1/\beta))$ and sparsity $s=O(\alpha^{-1}\log(1/\beta))$. We combine the constructions of Kane \& Nelson with either the Laplace or the Gaussian mechanism from the differential privacy literature, depending on the privacy parameters $\varepsilon$ and $\delta$. We also suggest a differentially private version of Fast Johnson-Lindenstrauss Transform (FJLT) by Ailon \& Chazelle (SIAM Journal of Computing 2009) which offers a tradeoff in speed for variance for certain parameters.

We answer an open question by Kenthapadi et al.~(Journal of Privacy and Confidentiality 2013) by analyzing the privacy and utility guarantees of an estimator for Euclidean distance, relying on Laplacian rather than Gaussian noise. We prove that the Laplace mechanism yields lower variance than the Gaussian mechanism whenever $\delta<\beta^{O(1/\alpha)}$.

Thus, our work poses an improvement over the work of Kenthapadi et al.~by giving a more efficient estimator with lower variance for sufficiently small $\delta$. Our sketch also achieves \emph{pure} differential privacy as a neat side-effect of the Laplace mechanism rather than the \emph{approximate} differential privacy guarantee of the Gaussian mechanism, which may not be sufficiently strong for some settings.

Our main result is a special case of more general, technical results proving that one can generally construct unbiased estimators for Euclidean distance with a high level of utility even under the constraint of differential privacy. The bulk of our analysis is proving that the variance of the estimator does not suffer too much in the presence of differential privacy.

We answer an open question by Kenthapadi et al.~(Journal of Privacy and Confidentiality 2013) by analyzing the privacy and utility guarantees of an estimator for Euclidean distance, relying on Laplacian rather than Gaussian noise. We prove that the Laplace mechanism yields lower variance than the Gaussian mechanism whenever $\delta<\beta^{O(1/\alpha)}$.

Thus, our work poses an improvement over the work of Kenthapadi et al.~by giving a more efficient estimator with lower variance for sufficiently small $\delta$. Our sketch also achieves \emph{pure} differential privacy as a neat side-effect of the Laplace mechanism rather than the \emph{approximate} differential privacy guarantee of the Gaussian mechanism, which may not be sufficiently strong for some settings.

Our main result is a special case of more general, technical results proving that one can generally construct unbiased estimators for Euclidean distance with a high level of utility even under the constraint of differential privacy. The bulk of our analysis is proving that the variance of the estimator does not suffer too much in the presence of differential privacy.

Original language | English |
---|---|

Title of host publication | PODS'21: Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems |

Volume | 40 |

Publisher | Interactions IX: Association of Computing Machinery ACM |

Publication date | Jun 2021 |

Pages | 42–56 |

Article number | 56 |

ISBN (Electronic) | 978-1-4503-8381-3 |

Publication status | Published - Jun 2021 |

## Keywords

- differential privacy
- dimensionality reduction
- Johnson-Lindenstrauss