Your Labels are Selling You Out: Relation Leaks in Vertical Federated Learning

IEEE Transactions on Dependable and Secure Computing (IF 7; Zone 2, Computer Science; JCR Q1, Computer Science, Hardware & Architecture) Pub Date: 2023-09-01 DOI: 10.1109/TDSC.2022.3208630

Pengyu Qiu, Xuhong Zhang, S. Ji, Tianyu Du, Yuwen Pu, Junfeng Zhou, Ting Wang

Volume 20, pp. 3653-3668.

Citations: 12

Abstract

Vertical federated learning (VFL) is an emerging privacy-preserving paradigm that enables collaboration between companies that share the same set of users but hold different features. Typically, one of them wants to expand into new business or improve its current service using the others' features. For instance, an e-commerce company that wants to improve its recommendation performance can incorporate users' preferences from another corporation, such as a social media company, through VFL. Meanwhile, graph data is a powerful and sensitive type of data widely used in industry. Its leakage, e.g., node leakage and/or relation leakage, can cause severe privacy issues and financial loss, so protecting graph data is important in practice. Though a line of work has studied how to learn with graph data in VFL, the privacy risks remain underexplored. In this paper, we perform the first systematic study of relation inference attacks to reveal VFL's risk of leaking samples' relations. Specifically, we assume the adversary to be a semi-honest participant. Then, according to the adversary's knowledge level, we formulate three kinds of attacks based on different intermediate representations. In particular, we design a novel numerical approximation method to handle VFL's encryption mechanism on the participant's representations. Extensive evaluations on four real-world datasets demonstrate the effectiveness of our attacks. For instance, the area under the curve (AUC) of relation inference can exceed 90%, indicating a strong relation inference capability. Furthermore, we evaluate possible defenses to examine our attacks' robustness. The results show that the defenses have limited impact. Our work highlights the need for advanced defenses to protect private relations and calls for more exploration of VFL's privacy and security issues.
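The abstract does not describe how the attacks score candidate relations, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that an adversary has somehow obtained per-sample intermediate representations, scores each pair of samples by cosine similarity, and reports the AUC of those scores against ground-truth relations. The synthetic data, the "same community means related" rule, and the names `cosine_scores`, `auc`, and `demo` are all hypothetical.

```python
import numpy as np


def cosine_scores(reps, pairs):
    """Cosine similarity between the representations of each (i, j) pair."""
    unit = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    return np.sum(unit[pairs[:, 0]] * unit[pairs[:, 1]], axis=1)


def auc(scores, labels):
    """AUC as the probability that a random related pair outscores a
    random unrelated pair (ties count as 0.5)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def demo(seed=0, n=60, dim=16):
    """Toy experiment on synthetic data (an assumption, not a real dataset)."""
    rng = np.random.default_rng(seed)
    # Assumption: samples belong to one of two communities, alternating.
    community = np.arange(n) % 2
    # Assumption: representations cluster around orthogonal community centers.
    centers = np.zeros((2, dim))
    centers[0, 0] = 8.0
    centers[1, 1] = 8.0
    reps = centers[community] + rng.normal(size=(n, dim))
    # Enumerate every unordered pair of samples.
    i, j = np.triu_indices(n, k=1)
    pairs = np.stack([i, j], axis=1)
    # Assumption: two samples are related iff they share a community.
    labels = (community[i] == community[j]).astype(int)
    return auc(cosine_scores(reps, pairs), labels)


if __name__ == "__main__":
    print(f"toy relation-inference AUC: {demo():.3f}")
```

The pairwise AUC here is computed directly from its rank-based definition to keep the sketch dependency-free; on real data the score function, the representations available to the adversary, and the handling of encrypted values would all differ from this toy setup.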

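Source journal

IEEE Transactions on Dependable and Secure Computing (Engineering & Technology - Computer Science: Software Engineering)

CiteScore: 11.20

Self-citation rate: 5.50%

Articles published: 354

Review time: 9 months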

Journal description: IEEE Transactions on Dependable and Secure Computing (TDSC) publishes peer-reviewed, archival research on dependable and secure computing systems and networks. It covers the principles, methodologies, and mechanisms for designing, modeling, and evaluating systems that meet required levels of reliability, security, and performance. Its scope includes measurement, modeling, and simulation techniques for understanding and improving system performance under various constraints, as well as the foundations for jointly evaluating, verifying, and designing systems that balance performance, security, and dependability. The journal is intended for researchers, engineers, and practitioners working in cybersecurity, fault tolerance, and system reliability.