Differentially private density estimation via Gaussian mixtures model

Yuncheng Wu, Yao Wu, Hui Peng, Juru Zeng, Hong Chen, Cuiping Li
Published in: 2016 IEEE/ACM 24th International Symposium on Quality of Service (IWQoS)
Publication date: 2016-06-20
DOI: 10.1109/IWQoS.2016.7590445 (https://doi.org/10.1109/IWQoS.2016.7590445)
Citations: 8

Abstract

Density estimation constructs an estimate of the probability density function from observed data. However, releasing such a function may compromise the privacy of individuals. Differential privacy is a notable paradigm for offering strong privacy guarantees in data analysis. In this paper, we propose DPGMM, a parametric density estimation algorithm that uses the Gaussian mixture model (GMM) under differential privacy. GMM is a well-known model that can approximate any distribution and can be fitted via the Expectation-Maximization (EM) algorithm. The main idea of DPGMM is to add two extra steps after the estimated parameters are obtained in the M step of each iteration. The first is the noise-adding step, which injects calibrated noise into the estimated parameters according to their L1-sensitivities and privacy budgets. The second is the post-processing step, which post-processes those noisy parameters that might break their intrinsic characteristics. Extensive experiments on both real and synthetic datasets evaluate the performance of DPGMM and demonstrate that the proposed method outperforms a state-of-the-art approach.
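The two extra steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and several choices here are assumptions: the abstract does not name the noise distribution, so the sketch uses Laplace noise with scale sensitivity/epsilon (the usual pairing with L1-sensitivity); the repair rules (clipping and renormalizing weights, symmetrizing covariances and clipping their eigenvalues) are one plausible reading of "intrinsic characteristics"; and all function names, the sensitivity values, and the per-parameter budgets are hypothetical placeholders rather than values from the paper.

```python
import numpy as np


def add_laplace_noise(value, l1_sensitivity, epsilon, rng):
    """Laplace mechanism (assumed): noise scale = L1-sensitivity / epsilon."""
    scale = l1_sensitivity / epsilon
    return value + rng.laplace(loc=0.0, scale=scale, size=np.shape(value))


def repair_weights(noisy_weights, floor=1e-6):
    """Clip mixture weights to be positive, then renormalize to sum to 1."""
    clipped = np.clip(noisy_weights, floor, None)
    return clipped / clipped.sum()


def repair_covariance(noisy_cov, min_eig=1e-6):
    """Symmetrize, then clip eigenvalues so the matrix is positive definite."""
    sym = (noisy_cov + noisy_cov.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(sym)
    eigvals = np.clip(eigvals, min_eig, None)
    # Rebuild V * diag(eigvals) * V^T from the clipped spectrum.
    return (eigvecs * eigvals) @ eigvecs.T


def noisy_m_step_output(weights, means, covs, sens, eps, rng):
    """Apply the noise-adding step, then the post-processing step.

    `sens` and `eps` are dicts with hypothetical keys "weights", "means",
    "covs" holding the L1-sensitivity and privacy budget of each parameter.
    """
    noisy_w = add_laplace_noise(weights, sens["weights"], eps["weights"], rng)
    noisy_mu = add_laplace_noise(means, sens["means"], eps["means"], rng)
    noisy_cov = add_laplace_noise(covs, sens["covs"], eps["covs"], rng)
    fixed_w = repair_weights(noisy_w)
    fixed_cov = np.stack([repair_covariance(c) for c in noisy_cov])
    return fixed_w, noisy_mu, fixed_cov


# Usage with made-up M-step estimates for a 2-component, 2-dimensional GMM.
rng = np.random.default_rng(0)
weights = np.array([0.6, 0.4])
means = np.array([[0.0, 0.0], [3.0, 3.0]])
covs = np.stack([np.eye(2), 0.5 * np.eye(2)])
sens = {"weights": 0.01, "means": 0.05, "covs": 0.05}  # placeholder values
eps = {"weights": 0.1, "means": 0.1, "covs": 0.1}      # placeholder budgets
w, mu, cov = noisy_m_step_output(weights, means, covs, sens, eps, rng)
```

In an iterative setting, the repaired parameters would feed the next E step, and the total privacy budget would have to be divided across iterations and parameters; how DPGMM allocates that budget is not stated in the abstract.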