A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis

Moritz Hardt, G. Rothblum
Venue: 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
Published: 2010-10-23
DOI: 10.1109/FOCS.2010.85 (https://doi.org/10.1109/FOCS.2010.85)
Citations: 411

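Abstract

We consider statistical data analysis in the interactive setting. In this setting a trusted curator maintains a database of sensitive information about individual participants, and releases privacy-preserving answers to queries as they arrive. Our primary contribution is a new differentially private multiplicative weights mechanism for answering a large number of interactive counting (or linear) queries that arrive online and may be adaptively chosen. This is the first mechanism with worst-case accuracy guarantees that can answer large numbers of interactive queries and is efficient (in terms of the runtime's dependence on the data universe size). The error is asymptotically optimal in its dependence on the number of participants, and depends only logarithmically on the number of queries being answered. The running time is nearly linear in the size of the data universe. As a further contribution, when we relax the utility requirement and require accuracy only for databases drawn from a rich class of databases, we obtain exponential improvements in running time. Even in this relaxed setting we continue to guarantee privacy for any input database. Only the utility requirement is relaxed. Specifically, we show that when the input database is drawn from a smooth distribution (a distribution that does not place too much weight on any single data item), accuracy remains as above, and the running time becomes poly-logarithmic in the data universe size. The main technical contributions are the application of multiplicative weights techniques to the differential privacy setting, a new privacy analysis for the interactive setting, and a technique for reducing data dimensionality for databases drawn from smooth distributions.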
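To make the idea of a multiplicative weights mechanism for online linear queries more concrete, here is a minimal Python sketch. It is a simplification written for illustration and is not the paper's exact algorithm: the function name `pmw_answer_queries` and the parameters `noise_scale`, `threshold`, and `eta` are hypothetical, their values are not the calibrated settings from the paper, and the sketch omits the privacy accounting (for example, bounding the number of update rounds) that an actual differential privacy guarantee would need. It only shows the general shape: keep a public estimate of the data histogram, answer from it when it agrees with a noisy true answer, and otherwise reweight it multiplicatively. Each query costs time linear in the universe size, which is in line with the nearly linear running time the abstract mentions.

```python
import math
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def pmw_answer_queries(histogram, queries, noise_scale, threshold, eta, rng):
    """Answer linear queries online with a simplified multiplicative weights loop.

    histogram: true normalized histogram over the data universe (sums to 1).
    queries: list of vectors with entries in [0, 1], one entry per universe item.
    noise_scale, threshold, eta: illustrative knobs (assumptions), not the
        paper's calibrated values.
    Returns (answers, final public estimate, number of update rounds).
    """
    size = len(histogram)
    synthetic = [1.0 / size] * size  # public estimate, starts uniform
    answers = []
    updates = 0
    for q in queries:
        true_answer = sum(qi * hi for qi, hi in zip(q, histogram))
        noisy_answer = true_answer + laplace(noise_scale, rng)
        estimate = sum(qi * si for qi, si in zip(q, synthetic))
        gap = noisy_answer - estimate
        if abs(gap) <= threshold:
            # Lazy round: the public estimate is already close, so reuse it.
            answers.append(estimate)
            continue
        # Update round: shift weight toward items that move the estimate
        # in the direction of the noisy answer, then renormalize.
        sign = 1.0 if gap > 0 else -1.0
        synthetic = [s * math.exp(sign * eta * qi) for s, qi in zip(synthetic, q)]
        total = sum(synthetic)
        synthetic = [s / total for s in synthetic]
        answers.append(noisy_answer)
        updates += 1
    return answers, synthetic, updates


# Toy usage on a four-item universe with two alternating counting queries.
demo_hist = [0.7, 0.1, 0.1, 0.1]
demo_queries = [[1, 0, 0, 0], [0, 1, 1, 0]] * 20
demo_answers, demo_estimate, demo_updates = pmw_answer_queries(
    demo_hist, demo_queries, noise_scale=0.01, threshold=0.1, eta=0.5,
    rng=random.Random(0),
)
```

In this sketch the true data is touched only through the noisy answer, and the public estimate changes only in update rounds. Mechanisms of this kind rely on that structure to handle many queries while consuming privacy budget mainly when an update happens; the precise analysis is in the paper, not in this example.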