TY - JOUR
T1 - PLAN: Variance-Aware Private Mean Estimation.
AU - Aumüller, Martin
AU - Lebeda, Christian Janos
AU - Nelson, Boel
AU - Pagh, Rasmus
N1 - DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.
PY - 2024
Y1 - 2024
AB - Differentially private mean estimation is an important building block in privacy-preserving algorithms for data analysis and machine learning. Though the trade-off between privacy and utility is well understood in the worst case, many datasets exhibit structure that could potentially be exploited to yield better algorithms. In this paper we present Private Limit Adapted Noise (PLAN), a family of differentially private algorithms for mean estimation in the setting where inputs are independently sampled from a distribution D over R^d, with coordinate-wise standard deviations σ ∈ R^d. Similar to mean estimation under Mahalanobis distance, PLAN tailors the shape of the noise to the shape of the data, but unlike previous algorithms the privacy budget is spent non-uniformly over the coordinates. Under a concentration assumption on D, we show how to exploit skew in the vector σ, obtaining a (zero-concentrated) differentially private mean estimate with ℓ_2 error proportional to ∥σ∥_1. Previous work has either not taken σ into account, or measured error in Mahalanobis distance; in both cases the resulting ℓ_2 error is proportional to √d ∥σ∥_2, which can be up to a factor √d larger. To verify the effectiveness of PLAN, we empirically evaluate accuracy on both synthetic and real-world data.
KW - differential privacy
KW - mean estimation
DO - 10.56553/popets-2024-0095
M3 - Journal article
VL - 2024
SP - 606
EP - 625
JO - Proc. Priv. Enhancing Technol.
JF - Proc. Priv. Enhancing Technol.
IS - 3
M1 - 3
ER -