An Attack-Defense Game-Based Reinforcement Learning Privacy-Preserving Method Against Inference Attack in Double Auction Market

IEEE Transactions on Automation Science and Engineering (IF 6.4; Zone 2, Computer Science; JCR Q1, Automation & Control Systems) | Pub Date: 2024-11-18 | DOI: 10.1109/TASE.2024.3496869
Donghe Li;Chunlin Hu;Qingyu Yang;Yuhao Ma;Feiye Zhang;Dou An
IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 9075-9089. Source: https://ieeexplore.ieee.org/document/10755104/
Citations: 0

Abstract

The auction mechanism, as a fair and efficient resource allocation method, has been widely used in a variety of trading scenarios, such as advertising, crowdsensing, and spectrum. However, beyond the pursuit of higher profits and satisfaction, privacy concerns have attracted researchers’ attention. In this paper, we mainly study privacy preservation in the double auction market against the indirect inference attack. Most existing works apply differential privacy theory to defend against the inference attack, but two problems remain. First, the ‘indistinguishability’ of differential privacy (DP) cannot prevent the disclosure of continuous valuations in the auction market. Second, the privacy-utility trade-off (PUT) in differential privacy deployment has not been resolved. To this end, we propose an attack-defense game-based reinforcement learning privacy-preserving method to provide practical privacy protection in double auctions. First, the auctioneer acts as the defender and adds noise to the bidders’ valuations, and then acts as the adversary and launches an inference attack. After that, the auctioneer uses the attack results and the auction results as a reference to guide the next deployment. This process can be regarded as a Markov Decision Process (MDP). The state is the valuation of each bidder at the current step. The action is the noise added to each bidder’s valuation. The reward is composed of privacy, utility, and training speed: attack success rate and social welfare are taken as the measures of privacy and utility, and a delay penalty term is used to reduce the training time. Utilizing the deep deterministic policy gradient (DDPG) algorithm, we establish an actor-critic network to solve the MDP. Finally, we conduct extensive evaluations to verify the performance of the proposed method. The results show that, compared with existing DP-based double auction privacy-preserving mechanisms, our method achieves better results in both privacy and utility: the attack success rate is reduced from nearly 100% to less than 20%, and the utility deviation is less than 5%.

Note to Practitioners: Privacy protection in trading markets, such as advertising, crowdsensing, and spectrum, is crucial. Traditional approaches like differential privacy have been unable to entirely guard sensitive data against inference attacks. To address this, we introduce a novel privacy-preserving mechanism for double auction markets. Our approach employs an attack-defense game model in which the auctioneer adds noise to bidders’ valuations and then launches an inference attack against the perturbed auction. This process evaluates the effectiveness of the noise and iteratively refines the privacy protection method. Transformed into a reinforcement learning model and optimized through a DDPG network, our mechanism reduces computational complexity. It has been shown to significantly diminish the success rate of inference attacks while maintaining a minimal utility deviation. Practitioners in auction-based markets can leverage our approach to enhance privacy protection without negatively impacting market performance. By integrating our mechanism into their operations, auctioneers can foster a safer and more efficient trading environment.
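
The MDP described in the abstract (state: valuations; action: per-bidder noise; reward: privacy, utility, and a delay penalty) can be illustrated with a minimal Python sketch of a single environment step. Everything concrete here is an assumption made for illustration and is not taken from the paper: the clearing rule (highest bids matched with lowest asks), the toy attack in `attack_success_rate` (the attacker simply guesses the reported value and succeeds if it falls within a relative tolerance of the true value), the reward weights, and all function and parameter names. The DDPG actor that would choose the noise is omitted.

```python
from typing import List, Tuple


def matched_pairs(bids: List[float], asks: List[float]) -> List[Tuple[int, int]]:
    """Toy clearing rule (assumption): match highest bids with lowest asks
    for as long as the bid is at least the ask."""
    buyers = sorted(range(len(bids)), key=lambda i: -bids[i])
    sellers = sorted(range(len(asks)), key=lambda j: asks[j])
    pairs = []
    for i, j in zip(buyers, sellers):
        if bids[i] < asks[j]:
            break
        pairs.append((i, j))
    return pairs


def social_welfare(pairs, true_bids, true_asks) -> float:
    """Welfare of the matched trades, always measured on TRUE valuations."""
    return sum(true_bids[i] - true_asks[j] for i, j in pairs)


def attack_success_rate(true_vals, reported_vals, tol: float) -> float:
    """Hypothetical stand-in for the inference attack: a guess counts as a
    success if it lies within a relative tolerance `tol` of the true value."""
    hits = sum(
        abs(r - t) <= tol * abs(t) for t, r in zip(true_vals, reported_vals)
    )
    return hits / len(true_vals)


def step(true_bids, true_asks, noise_bids, noise_asks, step_idx,
         w_privacy=1.0, w_utility=1.0, w_delay=0.01, tol=0.05):
    """One defender/attacker round. Returns (reward, attack rate, utility deviation).
    The weights and the additive reward form are illustrative assumptions."""
    # Defender: perturb every valuation with the chosen noise (the action).
    noisy_bids = [v + n for v, n in zip(true_bids, noise_bids)]
    noisy_asks = [v + n for v, n in zip(true_asks, noise_asks)]

    # Utility: compare welfare with and without noise.
    base = social_welfare(matched_pairs(true_bids, true_asks), true_bids, true_asks)
    perturbed = social_welfare(
        matched_pairs(noisy_bids, noisy_asks), true_bids, true_asks
    )
    utility_dev = abs(base - perturbed) / base if base > 0 else 0.0

    # Attacker: try to recover the true valuations from what was reported.
    asr = attack_success_rate(true_bids + true_asks, noisy_bids + noisy_asks, tol)

    # Reward: penalize attack success, utility loss, and elapsed steps.
    reward = -(w_privacy * asr + w_utility * utility_dev + w_delay * step_idx)
    return reward, asr, utility_dev
```

Calling `step([10.0, 8.0, 6.0], [3.0, 5.0, 9.0], [0.0] * 3, [0.0] * 3, 0)` evaluates the no-noise baseline, in which this toy attacker recovers every valuation. A learning agent would then search for noise vectors that lower the attack rate while keeping the utility deviation small, which is the trade-off the abstract's reward is designed to capture.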
Source journal
IEEE Transactions on Automation Science and Engineering (Engineering & Technology: Automation & Control Systems)
CiteScore: 12.50
Self-citation rate: 14.30%
Articles published: 404
Review time: 3.0 months
Journal description: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.