Peafowl: Private Entity Alignment in Multi-Party Privacy-Preserving Machine Learning
Authors: Ying Gao; Huanghao Deng; Zukun Zhu; Xiaofeng Chen; Yuxin Xie; Pei Duan; Peixuan Chen
IEEE Transactions on Information Forensics and Security, vol. 20, pp. 2706-2720, published 2025-02-14. DOI: 10.1109/TIFS.2025.3542244
https://ieeexplore.ieee.org/document/10887259/
In privacy-preserving machine learning (PPML) with vertically distributed data, private entity alignment methods are used to securely match and utilize features of the same samples. However, existing methods not only risk exposing sample intersections and introducing unnecessary samples but also fall short in multi-party scenarios. To address these limitations, we propose Peafowl, a novel multi-party private entity alignment protocol. Peafowl achieves entity alignment among multiple parties through a mapping from the original datasets to the intersection, termed a permutation. This method mitigates intersection disclosure and sample redundancy concerns by avoiding direct use of the intersection. The proposed protocol leverages a cloud server that uses a secret-shared shuffle to protect the privacy of the permutation, preventing colluding data providers from reconstructing intersections. Further, by integrating a seed-homomorphic pseudorandom generator, Peafowl avoids the intensive communication of secret sharing and achieves superior runtime performance. Additionally, an offline/online variant pre-computes the permutation calculations so that communication and computation complexity grow linearly with the dataset size. Implemented on a real PPML framework, the protocol demonstrates practical efficiency in various multi-party settings. Experimental results indicate that Peafowl's overhead is less than 1% of the total training cost, while the offline/online variant achieves approximately a 50% reduction in online runtime. Overall, Peafowl offers an efficient and straightforward solution for multi-party PPML, making it an attractive option for implementation and future improvements.
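To make the idea of aligning entities through a permutation more concrete, the following is a minimal plaintext sketch. It is not the Peafowl protocol: it computes the intersection in the clear, which is exactly what the protocol is designed to avoid, and it involves no cloud server, secret sharing, or pseudorandom generator. The function names, the two-party setup, and the choice to place intersection rows first are all illustrative assumptions rather than details taken from the paper.

```python
def alignment_permutation(ids, shared_order):
    """Return row indices that put the shared entities first, in the
    agreed order, followed by the party's remaining rows.

    Hypothetical helper for illustration only; it is not taken from the paper.
    """
    position = {entity_id: i for i, entity_id in enumerate(ids)}
    shared = set(shared_order)
    common_rows = [position[e] for e in shared_order]
    other_rows = [i for i, e in enumerate(ids) if e not in shared]
    return common_rows + other_rows


def apply_permutation(rows, perm):
    """Reorder rows according to a list of source indices."""
    return [rows[i] for i in perm]


# Two parties hold different features for overlapping sets of entities.
ids_a = ["u3", "u1", "u7", "u5"]
ids_b = ["u5", "u9", "u3"]

# ASSUMPTION: the intersection is computed in the clear here purely to
# show what the permutation achieves. A private protocol would hide it.
shared_order = sorted(set(ids_a) & set(ids_b))

perm_a = alignment_permutation(ids_a, shared_order)
perm_b = alignment_permutation(ids_b, shared_order)

aligned_a = apply_permutation(ids_a, perm_a)
aligned_b = apply_permutation(ids_b, perm_b)

# After permuting, the first len(shared_order) rows of each party
# refer to the same entities in the same order.
k = len(shared_order)
print(perm_a, perm_b)
print(aligned_a[:k], aligned_b[:k])
```

In this toy setting each party only needs its own permutation to line its rows up with the others. The abstract indicates that the real protocol additionally keeps the permutation itself private so that colluding data providers cannot recover the intersection from it.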
About the journal:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.