Towards Split Learning-based Privacy-Preserving Record Linkage

arXiv - CS - Databases. Pub Date: 2024-09-02. DOI: arxiv-2409.01088

Michail Zervas, Alexandros Karakasidis

{"title":"实现基于拆分学习的隐私保护记录链接","authors":"Michail Zervas, Alexandros Karakasidis","doi":"arxiv-2409.01088","DOIUrl":null,"url":null,"abstract":"Split Learning has been recently introduced to facilitate applications where\nuser data privacy is a requirement. However, it has not been thoroughly studied\nin the context of Privacy-Preserving Record Linkage, a problem in which the\nsame real-world entity should be identified among databases from different\ndataholders, but without disclosing any additional information. In this paper,\nwe investigate the potentials of Split Learning for Privacy-Preserving Record\nMatching, by introducing a novel training method through the utilization of\nReference Sets, which are publicly available data corpora, showcasing minimal\nmatching impact against a traditional centralized SVM-based technique.","PeriodicalId":501123,"journal":{"name":"arXiv - CS - Databases","volume":"113 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Towards Split Learning-based Privacy-Preserving Record Linkage\",\"authors\":\"Michail Zervas, Alexandros Karakasidis\",\"doi\":\"arxiv-2409.01088\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Split Learning has been recently introduced to facilitate applications where\\nuser data privacy is a requirement. However, it has not been thoroughly studied\\nin the context of Privacy-Preserving Record Linkage, a problem in which the\\nsame real-world entity should be identified among databases from different\\ndataholders, but without disclosing any additional information. 
In this paper,\\nwe investigate the potentials of Split Learning for Privacy-Preserving Record\\nMatching, by introducing a novel training method through the utilization of\\nReference Sets, which are publicly available data corpora, showcasing minimal\\nmatching impact against a traditional centralized SVM-based technique.\",\"PeriodicalId\":501123,\"journal\":{\"name\":\"arXiv - CS - Databases\",\"volume\":\"113 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Databases\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.01088\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Databases","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.01088","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}


Abstract

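Split Learning has recently been introduced to facilitate applications in which user data privacy is a requirement. However, it has not been thoroughly studied in the context of Privacy-Preserving Record Linkage, a problem in which the same real-world entity must be identified across databases held by different data holders without disclosing any additional information. In this paper, we investigate the potential of Split Learning for Privacy-Preserving Record Matching by introducing a novel training method that uses Reference Sets, which are publicly available data corpora. The method shows minimal impact on matching compared with a traditional centralized SVM-based technique.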


