Feature Creation based Slicing for Privacy Preserving Data Mining

Proceedings of the 3rd IKDD Conference on Data Science, 2016. Pub Date: 2016-03-13. DOI: 10.1145/2888451.2888462

R. Priyadarsini, M. Valarmathi, S. Sivakumari


Citations: 2

Abstract

In the digital era, vast amounts of data are collected and shared for research and analysis. These data contain sensitive information about people and organizations that must be protected during data mining. This work proposes the Feature Creation Based Slicing [FCBS] algorithm, which preserves privacy so that sensitive data are not exposed during data mining in a Multi Trust Level [MTL] environment. The proposed algorithm applies three layers of privacy preservation, using both perturbation and non-perturbation techniques, and creates new features from the existing attribute vector. Experiments are performed on real-life and benchmark datasets, and the results are compared with the existing slicing and L-diversity algorithms. The results show that privacy-preserved datasets generated by the proposed algorithm yield negligible hiding failure while protecting sensitive patterns during association mining, and give comparable utility during classification. Because of the feature creation process in the proposed algorithm, linking attacks and background knowledge attacks are prevented. In addition, the variance values of the privacy-preserved datasets show that they can prevent diversity attacks.
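The abstract reports "hiding failure" without defining it. As a minimal sketch, assuming the definition commonly used in the privacy-preserving association mining literature (the share of sensitive patterns that were frequent in the original data and remain frequent after sanitization), the metric could be computed as follows. The function names, the toy transactions, and the support threshold are hypothetical illustrations and are not taken from the paper.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def hiding_failure(sensitive_patterns, original, sanitized, min_support):
    """Assumed definition: sensitive patterns still frequent in the sanitized
    data, divided by sensitive patterns that were frequent in the original."""
    frequent_before = [
        p for p in sensitive_patterns if support(p, original) >= min_support
    ]
    if not frequent_before:
        return 0.0  # nothing to hide, so nothing can fail to be hidden
    still_frequent = [
        p for p in frequent_before if support(p, sanitized) >= min_support
    ]
    return len(still_frequent) / len(frequent_before)
```

Under this definition a value of 0 means every sensitive pattern was hidden, and a value of 1 means none were. The abstract's "negligible hiding failure" would correspond to a value close to 0, although the paper's exact formula may differ.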



