Improved Differentially Private Euclidean Distance Approximation

Research output: Conference Article in Proceeding or Book/Report chapter › Article in proceedings › Research › peer-review


This work shows how to privately and more accurately estimate the Euclidean distance between pairs of vectors. Input vectors $x$ and $y$ are mapped to differentially private sketches $x'$ and $y'$, from which one can estimate the distance between $x$ and $y$. Our estimator relies on the Sparser Johnson-Lindenstrauss Transform constructions by Kane \& Nelson (Journal of the ACM 2014), which for any $0<\alpha,\beta<1/2$ have optimal output dimension $k=\Theta(\alpha^{-2}\log(1/\beta))$ and sparsity $s=O(\alpha^{-1}\log(1/\beta))$. We combine the constructions of Kane \& Nelson with either the Laplace or the Gaussian mechanism from the differential privacy literature, depending on the privacy parameters $\varepsilon$ and $\delta$. We also suggest a differentially private version of the Fast Johnson-Lindenstrauss Transform (FJLT) by Ailon \& Chazelle (SIAM Journal on Computing 2009), which trades variance for speed in certain parameter regimes.
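The sketch-then-estimate pipeline described above can be illustrated with a minimal sketch. This is not the paper's construction: for simplicity it uses a dense Gaussian JL projection instead of the sparse Kane--Nelson matrices, and the sensitivity `sens` is a placeholder the caller must calibrate to the input domain.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_distance_estimate(x, y, k=1024, eps=1.0, sens=1.0):
    """Hypothetical illustration: project both vectors with a shared JL
    matrix (entries N(0, 1/k)), privatize each projection with the Laplace
    mechanism, then debias the squared distance by subtracting the
    expected noise contribution."""
    d = x.shape[0]
    A = rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, d))  # shared JL projection
    b = sens / eps                                      # Laplace scale
    xs = A @ x + rng.laplace(0.0, b, size=k)            # private sketch of x
    ys = A @ y + rng.laplace(0.0, b, size=k)            # private sketch of y
    # E[||n_x - n_y||^2] = k * 2 * Var(Laplace(b)) = 4*k*b^2, so subtracting
    # it makes the squared-distance estimate unbiased (before the sqrt).
    sq = np.sum((xs - ys) ** 2) - 4 * k * b * b
    return np.sqrt(max(sq, 0.0))
```

With $k$ rows of variance $1/k$, the projection alone preserves squared distances in expectation; the Laplace noise adds a known constant to the expected squared norm, which the estimator removes.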
We answer an open question by Kenthapadi et al.~(Journal of Privacy and Confidentiality 2013) by analyzing the privacy and utility guarantees of an estimator for Euclidean distance that relies on Laplace rather than Gaussian noise. We prove that the Laplace mechanism yields lower variance than the Gaussian mechanism whenever $\delta<\beta^{O(1/\alpha)}$.
Thus, our work improves on that of Kenthapadi et al.~by giving a more efficient estimator with lower variance for sufficiently small $\delta$. As a side effect of using the Laplace mechanism, our sketch also achieves \emph{pure} differential privacy, rather than the \emph{approximate} differential privacy guarantee of the Gaussian mechanism, which may not be sufficiently strong for some settings.

Our main result is a special case of more general technical results proving that one can construct unbiased estimators for Euclidean distance with a high level of utility even under the constraint of differential privacy. The bulk of our analysis is devoted to proving that the variance of the estimator does not increase too much in the presence of differential privacy.
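The unbiasedness claim can be checked empirically with a small Monte Carlo experiment. This uses the same simplified setup as above (a hypothetical dense JL projection plus Laplace noise, with the known noise variance $4kb^2$ subtracted), not the paper's actual construction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo check: averaging the debiased squared-distance estimate over
# many independent sketches should recover the true squared distance.
d, k, b = 100, 64, 0.5
x, y = rng.normal(size=d), rng.normal(size=d)
true_sq = np.sum((x - y) ** 2)

ests = []
for _ in range(2000):
    A = rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, d))   # fresh JL matrix
    n = rng.laplace(0.0, b, size=k) - rng.laplace(0.0, b, size=k)
    ests.append(np.sum((A @ (x - y) + n) ** 2) - 4 * k * b * b)
mean_est = np.mean(ests)
```

The empirical mean concentrates around the true squared distance, while the spread of the individual estimates reflects the variance that the paper's analysis bounds.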
Original language: English
Title of host publication: PODS'21: Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
Publisher: Association for Computing Machinery (ACM)
Publication date: Jun 2021
Article number: 56
ISBN (Electronic): 978-1-4503-8381-3
Publication status: Published - Jun 2021

Research areas

  • differential privacy, dimensionality reduction, Johnson-Lindenstrauss
