Detection of Sparse Mixtures With Differential Privacy
Ruizhi Zhang
IEEE Journal on Selected Areas in Information Theory, vol. 5, pp. 347-356. Published 2024-03-06.
DOI: 10.1109/JSAIT.2024.3396079
https://ieeexplore.ieee.org/document/10521599/
Citations: 0
Abstract
Detection of sparse signals arises in many modern applications such as signal processing, bioinformatics, finance, and disease surveillance. However, in many of these applications the data may contain sensitive personal information that should be protected during the analysis. In this article, we consider the problem of $(\epsilon,\delta)$-differentially private detection of a general sparse mixture, with a focus on how privacy affects the detection power. By investigating the nonasymptotic upper bound for the summation of error probabilities, we find that no $(\epsilon,\delta)$-differentially private test can detect the sparse signal if the privacy constraint is too strong or if the model parameters are in the undetectable region (Cai and Wu, 2014). Moreover, we study the private clamped log-likelihood ratio test proposed by Canonne et al. (2019) and show that it achieves vanishing error probabilities under certain conditions on the model parameters and privacy parameters. Then, for the case when the null distribution is a standard normal distribution, we propose an adaptive $(\epsilon,\delta)$-differentially private test, which achieves vanishing error probabilities in the same detectable region (Cai and Wu, 2014) when the privacy parameters satisfy certain sufficient conditions. Several numerical experiments are conducted to verify our theoretical results and illustrate the performance of our proposed test.
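To make the idea of a private clamped log-likelihood ratio test concrete, here is a minimal Python sketch. It is a simplified illustration and not the exact construction analyzed in the article or in Canonne et al. (2019). It assumes a Gaussian sparse mixture, with null $N(0,1)$ and alternative $(1-\eta)N(0,1)+\eta N(\mu,1)$, and uses Laplace noise, which gives pure $\epsilon$-differential privacy and therefore also $(\epsilon,\delta)$-differential privacy. The names `eta`, `mu`, `lo`, `hi`, and `threshold`, and the values chosen for them, are assumptions made for this example only.

```python
import numpy as np


def clamped_llr_statistic(x, eta, mu, lo, hi):
    """Sum of per-sample log-likelihood ratios, each clamped to [lo, hi]."""
    x = np.asarray(x, dtype=float)
    # log((1 - eta) + eta * exp(mu*x - mu^2/2)), rewritten as log1p(eta * expm1(.)).
    llr = np.log1p(eta * np.expm1(mu * x - 0.5 * mu**2))
    return float(np.clip(llr, lo, hi).sum())


def private_clamped_llr_test(x, eta, mu, epsilon, lo, hi, threshold, rng):
    """Return True to reject H0 (declare that a sparse signal is present)."""
    stat = clamped_llr_statistic(x, eta, mu, lo, hi)
    # Changing one sample moves the clamped sum by at most hi - lo,
    # so Laplace noise with scale (hi - lo) / epsilon gives epsilon-DP.
    noisy_stat = stat + rng.laplace(scale=(hi - lo) / epsilon)
    return bool(noisy_stat > threshold)


# Illustrative run with hypothetical parameter values.
rng = np.random.default_rng(0)
n, eta, mu = 10_000, 0.05, 2.0
null_sample = rng.standard_normal(n)
is_signal = rng.random(n) < eta  # which samples carry the shifted mean
alt_sample = rng.standard_normal(n) + mu * is_signal
for name, sample in [("null", null_sample), ("alternative", alt_sample)]:
    reject = private_clamped_llr_test(
        sample, eta, mu, epsilon=1.0, lo=-1.0, hi=1.0, threshold=0.0, rng=rng
    )
    print(name, "-> reject H0:", reject)
```

Clamping bounds how much a single record can change the statistic, which is what lets a fixed noise scale protect privacy. Tighter bounds require less noise but discard more of the evidence carried by large observations, which is one way to see why the privacy parameters affect the detection power.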