Authors: Juting Wang, Wai Kin Victor Chan
Published in: 2021 International Symposium on Electrical, Electronics and Information Engineering
Publication date: 2021-02-19
DOI: 10.1145/3459104.3459193 (https://doi.org/10.1145/3459104.3459193)
Citations: 1
A Design for Private Data Protection Combining with Data Perturbation and Data Reconstruction
With the rapid development of hardware and storage technology, large volumes of data of many types are being uploaded to storage platforms, and some of these data are private and cannot be ignored. A provider who wants to use these private data or deliver them to a third party should first apply data anonymization, such as K-anonymity [1], to mask explicit information. A receiver, in turn, can still transform the “fake” data into a new data set that has the same statistical properties as the original but differs from it in the individual records. To address this setting, we present a design that combines data perturbation with data reconstruction. We use RGADP (Retrievable General Additive Data Perturbation) [2] to perturb the data and the expectation-maximization (EM) algorithm to reconstruct them. The results show that perturbed data can be produced from the original data and can then be delivered, reversed, or further reconstructed easily, and that the reconstructed data retain the statistical properties of the original. Compared with conventional approaches, this method offers more flexibility, because the degree of privacy protection can be adjusted through the length of the bias interval. We integrate the two processes and report the findings of our experimental evaluation.
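The abstract does not give the details of RGADP or of the EM step, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a plain additive scheme in which the noise is uniform on an interval whose half-width stands in for the "bias interval", a seed that makes the perturbation reversible, and a standard histogram-based EM estimate of the original distribution. The function names `perturb`, `reverse`, and `em_reconstruct` and all parameter values are hypothetical.

```python
import random
import statistics


def perturb(values, half_width, seed):
    """Add zero-mean uniform noise from [-half_width, half_width] (assumed scheme)."""
    rng = random.Random(seed)
    return [v + rng.uniform(-half_width, half_width) for v in values]


def reverse(perturbed, half_width, seed):
    """Regenerate the same noise stream from the seed and subtract it."""
    rng = random.Random(seed)
    return [p - rng.uniform(-half_width, half_width) for p in perturbed]


def em_reconstruct(perturbed, half_width, n_bins=25, n_iter=50):
    """Estimate the distribution of the original values on a grid of bins.

    Uses only the perturbed values and the known noise interval, never the seed.
    Assumes the resulting bin width is at most 2 * half_width.
    Returns (bin midpoints, bin probabilities).
    """
    lo = min(perturbed) - half_width
    hi = max(perturbed) + half_width
    width = (hi - lo) / n_bins
    mids = [lo + (k + 0.5) * width for k in range(n_bins)]
    # With uniform noise the density is constant on its support, so only
    # the set of bins that could have produced each record matters.
    reachable = [
        [k for k, m in enumerate(mids) if abs(w - m) <= half_width]
        for w in perturbed
    ]
    probs = [1.0 / n_bins] * n_bins
    for _ in range(n_iter):
        new = [0.0] * n_bins
        for ks in reachable:
            total = sum(probs[k] for k in ks)
            if total == 0.0:
                continue  # record explained by no bin; skip it
            for k in ks:
                # E-step: share of this record attributed to bin k.
                new[k] += probs[k] / total
        # M-step: new bin probabilities are the normalized shares.
        norm = sum(new)
        probs = [p / norm for p in new]
    return mids, probs


if __name__ == "__main__":
    rng = random.Random(0)
    original = [rng.gauss(50.0, 10.0) for _ in range(500)]  # synthetic data
    released = perturb(original, half_width=10.0, seed=42)
    mids, probs = em_reconstruct(released, half_width=10.0)
    est_mean = sum(m * p for m, p in zip(mids, probs))
    print("original mean:     ", statistics.mean(original))
    print("reconstructed mean:", est_mean)
```

In this sketch a wider interval hides each record more thoroughly but gives the EM step less information per record, which is one way to read the trade-off the abstract attributes to the length of the bias interval.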