The Opportunity in Difficulty: A Dynamic Privacy Budget Allocation Mechanism for Privacy-Preserving Multi-dimensional Data Collection

ACM Transactions on Management Information Systems | IF 3.6 | Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2022-10-29 | Pages: 1-24 | DOI: 10.1145/3569944

Xue Chen, Cheng Wang, Qing Yang, Teng Hu, Changjun Jiang

Citations: 3

Abstract

Data collection under local differential privacy (LDP) has gradually come to the fore. Compared with applying LDP to single-attribute data collection, applying it to multi-dimensional data faces two major challenges: (1) Communication cost. Multivariate data collection needs to retain the correlations between attributes, which calls for more complex privatization mechanisms and therefore higher communication cost. (2) Noise scale. More attributes have to share a privacy budget that is limited by the required data utility and privacy-preserving level, so less privacy budget can be allocated to each attribute and more noise is added to the data. In this work, we turn the complex multi-dimensional attributes, i.e., the major negative factor behind the above difficulties, into a beneficial factor that improves the efficiency of privacy budget allocation, so as to realize multi-dimensional data collection under LDP with high overall performance. Specifically, we first present a multivariate k-ary Randomized Response (kRR) mechanism, called Multi-kRR, which applies RR directly to each attribute to reduce the communication cost. To deal with the large amount of noise, we propose a Markov-based dynamic privacy budget allocation mechanism, Markov-kRR, which determines the present privacy budget (flipping probability) of an attribute according to the state of the previous attributes. We then fix the threshold on the number of flips in Markov-kRR and propose an improved mechanism, MarkFixed-kRR, which achieves better utility when a suitable threshold is chosen. Finally, extensive experiments demonstrate the efficiency and effectiveness of the proposed methods.
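To make the noise-scale challenge concrete, the sketch below applies standard k-ary randomized response to each attribute of a record independently. It is not the paper's Multi-kRR, Markov-kRR, or MarkFixed-kRR; the even split of the total budget across attributes and all function names (`krr_keep_probability`, `krr_perturb`, `perturb_record`) are assumptions made for illustration. In standard kRR with budget ε over k values, the true value is reported with probability e^ε / (e^ε + k - 1) and each other value with probability 1 / (e^ε + k - 1).

```python
import math
import random


def krr_keep_probability(epsilon: float, k: int) -> float:
    """Probability that standard k-ary randomized response reports the true value."""
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def krr_perturb(value: int, k: int, epsilon: float, rng: random.Random) -> int:
    """Perturb one categorical value in {0, ..., k-1} with standard kRR."""
    if rng.random() < krr_keep_probability(epsilon, k):
        return value
    # Flip: report one of the other k-1 values, chosen uniformly at random.
    other = rng.randrange(k - 1)
    return other if other < value else other + 1


def perturb_record(record, domain_sizes, total_epsilon, rng):
    """Apply kRR to every attribute independently.

    Assumption (not from the paper): the total budget is split evenly,
    so each of the d attributes receives total_epsilon / d.
    """
    per_attribute_epsilon = total_epsilon / len(record)
    return [
        krr_perturb(value, k, per_attribute_epsilon, rng)
        for value, k in zip(record, domain_sizes)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # One attribute with the whole budget vs. five attributes sharing it.
    print(round(krr_keep_probability(1.0, 4), 3))      # about 0.475
    print(round(krr_keep_probability(1.0 / 5, 4), 3))  # about 0.289
    print(perturb_record([0, 3, 1, 2, 0], [4, 4, 4, 4, 4], 1.0, rng))
```

Under this even split with ε = 1 and k = 4, the probability of reporting the true value falls from about 0.475 for a single attribute to about 0.289 when five attributes share the budget, close to the 0.25 of a uniform guess. This is the per-attribute noise growth that a dynamic allocation scheme aims to mitigate.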

Source journal

ACM Transactions on Management Information Systems (COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore

6.30

Self-citation rate

20.00%

Publication volume