Performance Metrics Evaluation Towards The Effectiveness of Data Anonymization

A. Raj, Rio G. L. D'Souza
{"title":"Performance Metrics Evaluation Towards The Effectiveness of Data Anonymization","authors":"A. Raj, Rio G. L. D'Souza","doi":"10.1109/I2CT57861.2023.10126310","DOIUrl":null,"url":null,"abstract":"A supplementary method for ensuring that private data is inaccessible to outside parties is data anonymization. Anonymization might affect the outcomes of data mining procedures since it may make it more difficult for commonly used algorithms to analyze the data. This practical experience report compares the performance impact of current data anonymization algorithms to the suggested k-anonymization methods utilizing both original and anonymized data in order to assess the correctness and execution time. Through the use of kanonymization, l-diversity, t-closeness, and differential privacy techniques, a sample of genuine data produced by a healthcare facility was made anonymous. Contrary to predictions, the Hadoop framework was able to handle anonymization approaches, improving accuracy and performance while speeding up execution. These findings show that data anonymization techniques, when properly implemented through Hadoop ecosystems, can help to increase the effectiveness of data anonymization. Furthermore, the suggested method can produce the data anonymization with the necessary utility and protection trade-offs and with a performance scalable to large datasets.","PeriodicalId":150346,"journal":{"name":"2023 IEEE 8th International Conference for Convergence in Technology (I2CT)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE 8th International Conference for Convergence in Technology (I2CT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/I2CT57861.2023.10126310","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

A supplementary method for ensuring that private data is inaccessible to outside parties is data anonymization. Anonymization might affect the outcomes of data mining procedures since it may make it more difficult for commonly used algorithms to analyze the data. This practical experience report compares the performance impact of current data anonymization algorithms to the suggested k-anonymization methods utilizing both original and anonymized data in order to assess the correctness and execution time. Through the use of kanonymization, l-diversity, t-closeness, and differential privacy techniques, a sample of genuine data produced by a healthcare facility was made anonymous. Contrary to predictions, the Hadoop framework was able to handle anonymization approaches, improving accuracy and performance while speeding up execution. These findings show that data anonymization techniques, when properly implemented through Hadoop ecosystems, can help to increase the effectiveness of data anonymization. Furthermore, the suggested method can produce the data anonymization with the necessary utility and protection trade-offs and with a performance scalable to large datasets.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
数据匿名化有效性的性能指标评价
确保私有数据不被外界访问的补充方法是数据匿名化。匿名化可能会影响数据挖掘过程的结果,因为它可能使常用算法更难分析数据。本实践经验报告比较了当前数据匿名化算法与使用原始数据和匿名数据的建议k-匿名化方法的性能影响,以评估其正确性和执行时间。通过使用匿名化、l-多样性、t-接近和差异隐私技术,医疗机构生成的真实数据样本是匿名的。与预测相反,Hadoop框架能够处理匿名化方法,提高准确性和性能,同时加快执行速度。这些发现表明,数据匿名化技术,当通过Hadoop生态系统适当实施时,可以帮助提高数据匿名化的有效性。此外,所建议的方法可以产生具有必要的实用程序和保护权衡的数据匿名化,并且具有可扩展到大型数据集的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Investigation on Impact of Partial Shading on Solar PV Array Character and Word Level Gesture Recognition of Indian Sign Language Electricity Theft Detection Employing Machine Learning Algorithms Precision Agriculture: Classifying Banana Leaf Diseases with Hybrid Deep Learning Models Multimodal Question Generation using Multimodal Adaptation Gate (MAG) and BERT-based Model
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1