(ε, k)-Randomized Anonymization: ε-Differentially Private Data Sharing with k-Anonymity
Akito Yamamoto, E. Kimura, T. Shibuya
Proceedings of the International Conference on Health Informatics and Medical Application Technology, pp. 287-297, 2023
DOI: https://doi.org/10.5220/0011665600003414
Abstract: As the amount of biomedical and healthcare data increases, data mining for medicine becomes increasingly important for health improvement. At the same time, privacy concerns about data utilization have been growing. The key concepts for privacy protection are k-anonymity and differential privacy, but k-anonymity alone cannot protect information about whether an individual is present in the data, and differential privacy alone can leak identity. To promote data sharing throughout the world, universal methods that release the entire dataset while satisfying both concepts are required, but no such method yet exists. Therefore, we propose a novel privacy-preserving method, (ε, k)-Randomized Anonymization. In this paper, we first present two methods that compose the Randomized Anonymization method. They perform k-anonymization and randomized response in sequence and have adequate randomness and high privacy guarantees, respectively. Then, we show the algorithm for (ε, k)-Randomized Anonymization, which can provide highly accurate outputs with both k-anonymity and differential privacy. In addition, we describe the analysis procedures for each method using an inverse matrix and an expectation-maximization (EM) algorithm. In the experiments, we used real data to evaluate our methods' anonymity, privacy level, and accuracy. Furthermore, we show several examples of analysis results to demonstrate the high utility of the proposed methods.
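The abstract names randomized response and inverse-matrix analysis but does not give the algorithm itself, so the following is only a generic illustration and not the authors' (ε, k)-Randomized Anonymization. It sketches textbook randomized response over a single categorical attribute, assuming each record keeps its true category with probability e^ε / (e^ε + m - 1) for m categories, and then recovers the category distribution by solving a linear system with the transition matrix. The k-anonymization step and the EM-based analysis are omitted, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def rr_matrix(num_categories, epsilon):
    """Transition matrix P[i][j] = Pr(report j | true value i)."""
    e = math.exp(epsilon)
    keep = e / (e + num_categories - 1)    # probability of reporting the truth
    flip = 1.0 / (e + num_categories - 1)  # probability of each other category
    return [[keep if i == j else flip for j in range(num_categories)]
            for i in range(num_categories)]


def randomize(value, matrix, rng):
    """Draw the reported category for one record from its row of the matrix."""
    return rng.choices(range(len(matrix)), weights=matrix[value])[0]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def estimate_distribution(reports, matrix):
    """Estimate true category frequencies pi from reports.

    The observed frequencies satisfy lambda = P^T pi, so pi is obtained by
    solving that system (equivalent to applying the inverse of P^T).
    """
    m = len(matrix)
    counts = Counter(reports)
    observed = [counts[j] / len(reports) for j in range(m)]
    transposed = [[matrix[i][j] for i in range(m)] for j in range(m)]
    return solve(transposed, observed)


# Illustrative usage with synthetic data (not data from the paper).
rng = random.Random(0)
matrix = rr_matrix(3, epsilon=1.0)
true_values = [0] * 6000 + [1] * 3000 + [2] * 1000
reports = [randomize(v, matrix, rng) for v in true_values]
estimate = estimate_distribution(reports, matrix)
```

Because the ratio between any two entries in a column of the matrix is at most e^ε, each individual report satisfies ε-local differential privacy. The inverse-matrix estimate is unbiased but can produce slightly negative frequencies on small samples, which is one common reason to use an iterative estimator such as EM instead.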