Abstract
Data analysis is highly valuable for both commercial and research purposes. However, disclosing analysis results may pose severe privacy risks to individuals. Privug is a method that quantifies the privacy risks of data analytics programs by analyzing their source code. The method uses probability distributions to model attacker knowledge and Bayesian inference to update that knowledge based on observable outputs. Currently, Privug uses Markov Chain Monte Carlo (MCMC) to perform inference, which is a flexible but approximate solution. This paper presents an exact Bayesian inference engine based on multivariate Gaussian distributions to accurately and efficiently quantify privacy risks. The inference engine is implemented for a subset of Python programs that can be modeled as multivariate Gaussian models. We evaluate the method by analyzing privacy risks in programs that release public statistics. The evaluation shows that our method analyzes privacy risks accurately and efficiently, and outperforms existing methods. Furthermore, we demonstrate the use of our engine to analyze the effect of differential privacy on public statistics.
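To make the idea of exact Gaussian inference concrete, the following is a minimal sketch and not the engine described in the paper. It assumes a hypothetical setting in which an attacker holds the same independent Gaussian prior over each of `n` individuals' values, observes their released average, and updates the belief about one individual using the standard closed-form conditioning rule for jointly Gaussian variables. The names `Gaussian` and `posterior_of_target` are illustrative, and the optional `noise_var` parameter models additive zero-mean Gaussian noise on the output, which is only one possible way to approximate a differentially private release.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    """A univariate Gaussian belief, described by its mean and variance."""
    mean: float
    var: float


def posterior_of_target(prior: Gaussian, n: int, observed_avg: float,
                        noise_var: float = 0.0) -> Gaussian:
    """Exact posterior of one individual's value given the released average.

    Assumptions (hypothetical, for illustration only):
      * the n values x_1..x_n are i.i.d. with the attacker's Gaussian prior;
      * the output is o = (x_1 + ... + x_n) / n + noise,
        with noise ~ N(0, noise_var) independent of the data.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if noise_var < 0:
        raise ValueError("noise_var must be non-negative")

    # Moments of the joint Gaussian over (x_1, o).
    cov_xo = prior.var / n              # Cov(x_1, o)
    var_o = prior.var / n + noise_var   # Var(o)
    if var_o <= 0:
        raise ValueError("the output must have positive variance")

    # Gaussian conditioning: x_1 | o is again Gaussian.
    gain = cov_xo / var_o
    post_mean = prior.mean + gain * (observed_avg - prior.mean)
    post_var = prior.var - gain * cov_xo
    return Gaussian(post_mean, post_var)


# Attacker prior: value ~ N(40, 100); four individuals; released average is 50.
prior = Gaussian(mean=40.0, var=100.0)
exact = posterior_of_target(prior, n=4, observed_avg=50.0)
noisy = posterior_of_target(prior, n=4, observed_avg=50.0, noise_var=25.0)
```

In this sketch, the drop from the prior variance to the posterior variance is one simple indicator of how much the attacker learns about the target. Adding noise to the output leaves the posterior variance closer to the prior, so the attacker learns less.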
Original language | English |
---|---|
Title of host publication | Proceedings of Software Engineering and Formal Methods (SEFM'23) |
Number of pages | 18 |
Volume | 14323 |
Publisher | Springer, Cham |
Publication date | 31 Oct 2023 |
Pages | 263-281 |
ISBN (Print) | 978-3-031-47114-8 |
ISBN (Electronic) | 978-3-031-47115-5 |
Publication status | Published - 31 Oct 2023 |
Event | 21st International Conference on Software Engineering and Formal Methods, Eindhoven, Netherlands. Duration: 6 Nov 2023 → 10 Nov 2023. https://sefm-conference.github.io/2023/ |
Conference
Conference | 21st International Conference on Software Engineering and Formal Methods |
---|---|
Country/Territory | Netherlands |
City | Eindhoven |
Period | 06/11/2023 → 10/11/2023 |
Internet address | https://sefm-conference.github.io/2023/ |
Series | Lecture Notes in Computer Science |
---|---|
ISSN | 0302-9743 |
Keywords
- Data analysis
- Privacy risks
- Bayesian inference
- Markov Chain Monte Carlo
- Differential privacy