Privacy Analysis of Format-Preserving Data-Masking Techniques
Authors: Zaruhi Aslanyan, M. Boesgaard
Published in: 2019 12th CMI Conference on Cybersecurity and Privacy (CMI), 2019-11-01
DOI: 10.1109/CMI48017.2019.8962143 (https://doi.org/10.1109/CMI48017.2019.8962143)
Citations: 3
Abstract
With the growing number of regulations and concerns regarding data privacy, there is an increasing need to protect Personally Identifiable Information (PII). A widely used approach to protecting PII is to apply data-masking techniques that remove or hide the identities of the individuals referred to in the data under investigation. A particular class of data-masking techniques aims at preserving the format of the source data, so that encoded data can be used wherever the corresponding source is expected, thereby minimising the application changes needed to perform tasks such as statistical analysis or testing. Various encoding techniques are used to protect data privacy while preserving the format, including Format-Preserving Encryption (FPE) and masking out. Although convenient, preserving the format of data can enable re-identification attacks. In this paper, we discuss the vulnerabilities of data-masking techniques that preserve the format of data and analyse their security and privacy properties. We investigate two industrial datasets and quantify the potential data privacy leakage that could arise from using inappropriate data-masking techniques.
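To make the re-identification concern concrete, the following is a minimal sketch of masking out and of one simple way to gauge what a masked field still reveals. It is not the paper's method or metric: the helper names `mask_out` and `unique_fraction`, the choice to keep the last four characters, and the sample identifiers are all assumptions made up for illustration.

```python
from collections import Counter


def mask_out(value: str, keep_last: int = 4, mask_char: str = "*") -> str:
    """Mask all alphanumeric characters except the last `keep_last`.

    Separators such as '-' are left in place, so the masked value keeps
    the format of the source value. (Hypothetical helper for illustration.)
    """
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - keep_last, 0)
    out = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            out.append(mask_char)
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


def unique_fraction(masked_values: list[str]) -> float:
    """Fraction of records whose masked value is shared with no other record.

    A record that stays unique after masking can still be singled out by
    anyone who knows the unmasked part. (Assumed leakage proxy, not the
    measure used in the paper.)
    """
    if not masked_values:
        return 0.0
    counts = Counter(masked_values)
    unique = sum(1 for v in masked_values if counts[v] == 1)
    return unique / len(masked_values)


# Made-up identifiers, not taken from any real dataset.
records = ["555-12-3456", "555-98-3456", "555-44-7788", "123-45-6789"]
masked = [mask_out(r) for r in records]

print(masked)
print(unique_fraction(masked))
```

In this toy sample two of the four masked values remain unique, so half of the records could still be singled out from the visible suffix alone. Keeping fewer characters lowers that fraction at the cost of utility, which is the kind of trade-off the abstract alludes to.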