垂直联邦学习预测过程中隐私泄露的综合分析

Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium Pub Date : 2022-03-03 DOI:10.2478/popets-2022-0045

Xue Jiang, Xuebing Zhou, Jens Grossklags

{"title":"垂直联邦学习预测过程中隐私泄露的综合分析","authors":"Xue Jiang, Xuebing Zhou, Jens Grossklags","doi":"10.2478/popets-2022-0045","DOIUrl":null,"url":null,"abstract":"Abstract Vertical federated learning (VFL), a variant of federated learning, has recently attracted increasing attention. An active party having the true labels jointly trains a model with other parties (referred to as passive parties) in order to use more features to achieve higher model accuracy. During the prediction phase, all the parties collaboratively compute the predicted confidence scores of each target record and the results will be finally returned to the active party. However, a recent study by Luo et al. [28] pointed out that the active party can use these confidence scores to reconstruct passive-party features and cause severe privacy leakage. In this paper, we conduct a comprehensive analysis of privacy leakage in VFL frameworks during the prediction phase. Our study improves on previous work [28] regarding two aspects. We first design a general gradient-based reconstruction attack framework that can be flexibly applied to simple logistic regression models as well as multi-layer neural networks. Moreover, besides performing the attack under the white-box setting, we give the first attempt to conduct the attack under the black-box setting. Extensive experiments on a number of real-world datasets show that our proposed attack is effective under different settings and can achieve at best twice or thrice of a reduction of attack error compared to previous work [28]. We further analyze a list of potential mitigation approaches and compare their privacy-utility performances. Experimental results demonstrate that privacy leakage from the confidence scores is a substantial privacy risk in VFL frameworks during the prediction phase, which cannot be simply solved by crypto-based confidentiality approaches. On the other hand, processing the confidence scores with information compression and randomization approaches can provide strengthened privacy protection.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":"2022 1","pages":"263 - 281"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":"{\"title\":\"Comprehensive Analysis of Privacy Leakage in Vertical Federated Learning During Prediction\",\"authors\":\"Xue Jiang, Xuebing Zhou, Jens Grossklags\",\"doi\":\"10.2478/popets-2022-0045\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract Vertical federated learning (VFL), a variant of federated learning, has recently attracted increasing attention. An active party having the true labels jointly trains a model with other parties (referred to as passive parties) in order to use more features to achieve higher model accuracy. During the prediction phase, all the parties collaboratively compute the predicted confidence scores of each target record and the results will be finally returned to the active party. However, a recent study by Luo et al. [28] pointed out that the active party can use these confidence scores to reconstruct passive-party features and cause severe privacy leakage. In this paper, we conduct a comprehensive analysis of privacy leakage in VFL frameworks during the prediction phase. Our study improves on previous work [28] regarding two aspects. We first design a general gradient-based reconstruction attack framework that can be flexibly applied to simple logistic regression models as well as multi-layer neural networks. Moreover, besides performing the attack under the white-box setting, we give the first attempt to conduct the attack under the black-box setting. Extensive experiments on a number of real-world datasets show that our proposed attack is effective under different settings and can achieve at best twice or thrice of a reduction of attack error compared to previous work [28]. We further analyze a list of potential mitigation approaches and compare their privacy-utility performances. Experimental results demonstrate that privacy leakage from the confidence scores is a substantial privacy risk in VFL frameworks during the prediction phase, which cannot be simply solved by crypto-based confidentiality approaches. On the other hand, processing the confidence scores with information compression and randomization approaches can provide strengthened privacy protection.\",\"PeriodicalId\":74556,\"journal\":{\"name\":\"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium\",\"volume\":\"2022 1\",\"pages\":\"263 - 281\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-03-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"25\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2478/popets-2022-0045\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2478/popets-2022-0045","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 25

摘要

摘要垂直联邦学习(Vertical federated learning, VFL)是联邦学习的一种变体，近年来受到越来越多的关注。拥有真实标签的主动方与其他方(称为被动方)共同训练模型，以使用更多的特征来达到更高的模型精度。在预测阶段，各方协同计算每条目标记录的预测置信度分数，最终将结果返回给活动方。然而，最近Luo等人的一项研究指出，主动方可以利用这些置信度得分来重构被动方的特征，从而造成严重的隐私泄露。在本文中，我们在预测阶段对VFL框架中的隐私泄漏进行了全面分析。我们的研究在两个方面改进了前人的工作b[28]。我们首先设计了一个通用的基于梯度的重构攻击框架，该框架可以灵活地应用于简单的逻辑回归模型和多层神经网络。此外，除了在白盒设置下进行攻击外，我们还首次尝试在黑盒设置下进行攻击。在大量真实数据集上进行的大量实验表明，我们提出的攻击在不同的设置下都是有效的，与以前的工作相比，最多可以将攻击误差减少两到三倍。我们进一步分析了一系列潜在的缓解方法，并比较了它们的隐私效用性能。实验结果表明，在VFL框架中，在预测阶段，置信度分数的隐私泄露是一个重大的隐私风险，不能简单地通过基于加密的保密方法来解决。另一方面，采用信息压缩和随机化的方法处理置信度分数，可以加强隐私保护。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Comprehensive Analysis of Privacy Leakage in Vertical Federated Learning During Prediction

Abstract Vertical federated learning (VFL), a variant of federated learning, has recently attracted increasing attention. An active party having the true labels jointly trains a model with other parties (referred to as passive parties) in order to use more features to achieve higher model accuracy. During the prediction phase, all the parties collaboratively compute the predicted confidence scores of each target record and the results will be finally returned to the active party. However, a recent study by Luo et al. [28] pointed out that the active party can use these confidence scores to reconstruct passive-party features and cause severe privacy leakage. In this paper, we conduct a comprehensive analysis of privacy leakage in VFL frameworks during the prediction phase. Our study improves on previous work [28] regarding two aspects. We first design a general gradient-based reconstruction attack framework that can be flexibly applied to simple logistic regression models as well as multi-layer neural networks. Moreover, besides performing the attack under the white-box setting, we give the first attempt to conduct the attack under the black-box setting. Extensive experiments on a number of real-world datasets show that our proposed attack is effective under different settings and can achieve at best twice or thrice of a reduction of attack error compared to previous work [28]. We further analyze a list of potential mitigation approaches and compare their privacy-utility performances. Experimental results demonstrate that privacy leakage from the confidence scores is a substantial privacy risk in VFL frameworks during the prediction phase, which cannot be simply solved by crypto-based confidentiality approaches. On the other hand, processing the confidence scores with information compression and randomization approaches can provide strengthened privacy protection.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊