L-diversity: privacy beyond k-anonymity

22nd International Conference on Data Engineering (ICDE'06). Pub Date: 2006-04-03. DOI: 10.1145/1217299.1217302

Ashwin Machanavajjhala, J. Gehrke, Daniel Kifer, Muthuramakrishnan Venkitasubramaniam


Citations: 5163

Abstract

Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k − 1 other records with respect to certain "identifying" attributes. In this paper we show with two simple attacks that a k-anonymized dataset has some subtle but severe privacy problems. First, we show that an attacker can discover the values of sensitive attributes when there is little diversity in those sensitive attributes. Second, attackers often have background knowledge, and we show that k-anonymity does not guarantee privacy against attackers using background knowledge. We give a detailed analysis of these two attacks and we propose a novel and powerful privacy definition called ℓ-diversity. In addition to building a formal foundation for ℓ-diversity, we show in an experimental evaluation that ℓ-diversity is practical and can be implemented efficiently.
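The first attack described in the abstract can be illustrated with a small sketch. The table, the column names, and the helper functions below are hypothetical and are not taken from the paper. The diversity check uses the simplest possible reading (at least ℓ distinct sensitive values in every group of records that share the same identifying attributes); the abstract does not state the formal definition, so this is an assumption made for illustration only.

```python
from collections import defaultdict


def group_by_quasi_identifiers(records, quasi_ids):
    """Group records that share the same values on the identifying attributes."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_ids)
        groups[key].append(rec)
    return groups


def is_k_anonymous(records, quasi_ids, k):
    """True if every record is indistinguishable from at least k - 1 others."""
    groups = group_by_quasi_identifiers(records, quasi_ids)
    return all(len(group) >= k for group in groups.values())


def has_distinct_l_diversity(records, quasi_ids, sensitive, l):
    """Assumed simplification: every group holds at least l distinct sensitive values."""
    groups = group_by_quasi_identifiers(records, quasi_ids)
    return all(
        len({rec[sensitive] for rec in group}) >= l
        for group in groups.values()
    )


# Hypothetical, already generalized toy table (not data from the paper).
records = [
    {"zip": "130**", "age": "<30", "condition": "heart disease"},
    {"zip": "130**", "age": "<30", "condition": "heart disease"},
    {"zip": "130**", "age": "<30", "condition": "heart disease"},
    {"zip": "148**", "age": ">=40", "condition": "flu"},
    {"zip": "148**", "age": ">=40", "condition": "cancer"},
    {"zip": "148**", "age": ">=40", "condition": "heart disease"},
]
QUASI_IDS = ["zip", "age"]

k_ok = is_k_anonymous(records, QUASI_IDS, k=3)
l_ok = has_distinct_l_diversity(records, QUASI_IDS, "condition", l=2)
print(k_ok, l_ok)
```

In this toy table both groups contain three records, so the table is 3-anonymous, yet everyone in the first group has the same condition. An attacker who knows only that a person falls in that group learns the sensitive value, which is the lack-of-diversity problem the abstract describes.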

