Quantifying Privacy Risk with Gaussian Mixtures

Publication: Article in journal and conference article in journal · Journal article · Research · Peer-reviewed

Abstract

Data anonymization methods gain legal importance as data collection and analysis are expanding dramatically in data management and statistical research. Yet applying anonymization, or understanding how well a given analytics program hides sensitive information, is non-trivial. Privug is a method to quantify privacy risks of data analytics programs by analyzing their source code. The method uses probability distributions to model attacker knowledge and Bayesian inference to update said knowledge based on observable outputs. Currently, Privug is equipped with approximate Bayesian inference methods (such as Markov Chain Monte Carlo), and an exact Bayesian inference method based on multivariate Gaussian distributions. This paper introduces a privacy risk analysis engine based on Gaussian mixture models that combines exact and approximate inference. It extends the multivariate Gaussian engine by supporting exact inference in programs with continuous and discrete distributions as well as if-statements. Furthermore, the engine allows for approximating attacker knowledge that is not normally distributed. We evaluate the method by analyzing privacy risks in programs to release public statistics, differential privacy mechanisms, randomized response and attribute generalization. Finally, we show that our engine can be used to analyze programs involving thousands of sensitive records.
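
The exact-inference idea in the abstract can be illustrated with a small sketch. This is not Privug's engine or its API; the function names, the example prior, and the additive Gaussian noise program are all assumptions chosen for illustration. The attacker's prior belief about one sensitive value is a two-component Gaussian mixture, the program releases the value plus Gaussian noise, and the posterior after observing the output is again a Gaussian mixture whose weights, means, and variances follow in closed form.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def posterior_mixture(prior, noise_var, observed):
    """Exact Bayesian update of a Gaussian-mixture prior (hypothetical helper).

    prior: list of (weight, mean, var) components describing attacker knowledge.
    The assumed program outputs o = x + noise, with noise ~ N(0, noise_var).
    Returns the posterior over x given o = observed, as a list of
    (weight, mean, var) components.
    """
    updated = []
    for weight, mean, var in prior:
        # Marginal likelihood of the observation under this component.
        evidence = normal_pdf(observed, mean, var + noise_var)
        # Standard Gaussian conditioning for a linear observation with noise.
        gain = var / (var + noise_var)
        post_mean = mean + gain * (observed - mean)
        post_var = (1 - gain) * var
        updated.append((weight * evidence, post_mean, post_var))
    total = sum(w for w, _, _ in updated)
    return [(w / total, m, v) for w, m, v in updated]


# Assumed attacker prior: the value is near 30 or near 60 with equal weight.
prior = [(0.5, 30.0, 25.0), (0.5, 60.0, 25.0)]
posterior = posterior_mixture(prior, noise_var=100.0, observed=55.0)

for weight, mean, var in posterior:
    print(f"weight={weight:.3f} mean={mean:.2f} var={var:.2f}")
```

Comparing the prior and posterior components shows how much the released output narrows the attacker's uncertainty, which is the kind of quantity a privacy risk analysis would report.
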
Original language: English
Journal: Software and Systems Modeling (SoSyM)
Pages (from-to): 1-22
Number of pages: 22
Status: Published - 23 Jun 2025