Reducing Amount of Information Loss in k-Anonymization for Secondary Use of Collected Personal Information
Authors: Kunihiko Harada, Yoshinori Sato, Yumiko Togashi
Published in: 2012 Annual SRII Global Conference, 2012-07-24
DOI: 10.1109/SRII.2012.18 (https://doi.org/10.1109/SRII.2012.18)
Citation count: 5
Abstract
A large amount of information has recently been collected, and the need to put it to secondary use is growing because it contains a great deal of useful knowledge. Secondary use of personal information, however, always raises privacy concerns. k-anonymization is a technique for releasing personal information in a privacy-protected manner. Classical k-anonymization always requires side information in the form of generalization hierarchies. In addition, the quality of k-anonymized data has long been a central problem in the area, because information loss is inherent to anonymization. This paper proposes a new scheme in which generalization hierarchies are constructed automatically from the input data. The scheme both reduces the operational cost of preparing side information and improves the quality of k-anonymization results. Experiments demonstrate that k-anonymization with automatically constructed hierarchies sacrifices 38% less information (measured by information entropy) than k-anonymization with complete binary trees, which are introduced as the classically used hierarchies.
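To make the terms in the abstract concrete, the following is a minimal Python sketch of k-anonymity over a single quasi-identifier column and of information loss measured as a drop in Shannon entropy. It is not the scheme proposed in the paper: the fixed-width intervals stand in for a hand-prepared generalization hierarchy, and the function names (`entropy`, `is_k_anonymous`, `generalize`, `anonymize`), the candidate widths, and the sample ages are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def is_k_anonymous(values, k):
    """True if every distinct value is shared by at least k records."""
    return all(c >= k for c in Counter(values).values())


def generalize(ages, width):
    """Replace each age by the start of the fixed-width interval containing it."""
    return [age // width * width for age in ages]


def anonymize(ages, k, widths=(1, 2, 4, 8, 16)):
    """Try interval widths from finest to coarsest; return the first that is k-anonymous.

    Returns (width, generalized_values, entropy_loss_in_bits), or None if no
    width in `widths` achieves k-anonymity. The widths are a hypothetical,
    fixed hierarchy, not one constructed from the data.
    """
    for width in widths:
        generalized = generalize(ages, width)
        if is_k_anonymous(generalized, k):
            return width, generalized, entropy(ages) - entropy(generalized)
    return None


ages = [20, 21, 22, 23, 24, 25, 26, 27]  # illustrative quasi-identifier column
width, generalized, loss = anonymize(ages, k=2)
print(width, generalized, loss)
```

In this sketch a larger k forces coarser intervals and therefore a larger entropy loss. Because the interval boundaries are fixed in advance, data that does not align with them may need more generalization than necessary, which is the kind of loss a hierarchy built from the input data could plausibly reduce.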