Feature Creation based Slicing for Privacy Preserving Data Mining

R. Priyadarsini, M. Valarmathi, S. Sivakumari
Proceedings of the 3rd IKDD Conference on Data Science, 2016. Published 2016-03-13.
DOI: 10.1145/2888451.2888462 (https://doi.org/10.1145/2888451.2888462)
Citations: 2

Abstract

In the digital era, vast amounts of data are collected and shared for research and analysis. These data contain sensitive information about people and organizations that needs to be protected during data mining. This work proposes the Feature Creation Based Slicing (FCBS) algorithm, which preserves privacy so that sensitive data are not exposed during data mining in a Multi Trust Level (MTL) environment. The proposed algorithm applies three layers of privacy preservation, using both perturbation and non-perturbation techniques, and creates new features from the existing attribute vector. Experiments are performed on real-life and benchmark datasets, and the results are compared with the existing slicing and L-diversity algorithms. The results show that privacy-preserved datasets generated by the proposed algorithm yield negligible hiding failure while protecting sensitive patterns during association mining, and give comparable utility during classification. Because of the feature creation process in the proposed algorithm, linking and background-knowledge attacks are prevented. In addition, the variance values of the privacy-preserved datasets show that they can prevent diversity attacks.
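
The abstract reports "negligible hiding failure" without defining the metric. As a rough illustration only, the Python sketch below uses the definition common in the pattern-hiding literature: the fraction of sensitive patterns that could be mined from the original data and can still be mined from the privacy-preserved data. The paper's exact definition may differ. The function names, the brute-force itemset counter, the support threshold, and the toy transactions are all assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return itemsets (as frozensets) whose support count is >= min_support.

    Brute-force counting, adequate only for small illustrative inputs.
    """
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {itemset for itemset, n in counts.items() if n >= min_support}


def hiding_failure(original, sanitized, sensitive, min_support, max_size=2):
    """Share of sensitive itemsets frequent in `original` that stay frequent in `sanitized`.

    Assumed definition: |sensitive patterns mined from sanitized data|
    divided by |sensitive patterns mined from original data|.
    Returns 0.0 when no sensitive itemset was frequent to begin with.
    """
    sensitive = {frozenset(s) for s in sensitive}
    before = frequent_itemsets(original, min_support, max_size) & sensitive
    if not before:
        return 0.0
    after = frequent_itemsets(sanitized, min_support, max_size) & before
    return len(after) / len(before)


# Hypothetical toy data: {bread, milk} is the pattern to hide.
original = [["bread", "milk"], ["bread", "milk"], ["bread", "beer"], ["milk", "beer"]]
sanitized = [["bread"], ["bread", "milk"], ["bread", "beer"], ["milk", "beer"]]
sensitive = [("bread", "milk")]

hf_sanitized = hiding_failure(original, sanitized, sensitive, min_support=2)
hf_unchanged = hiding_failure(original, original, sensitive, min_support=2)
```

Under this assumed definition, a value near 0 means almost no sensitive pattern survives in the released data, and a value of 1 means every sensitive pattern is still discoverable.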