Title: Privacy protection against user profiling through optimal data generalization
Authors: César Gil, Javier Parra-Arnau, Jordi Forné
DOI: 10.1016/j.cose.2024.104178
Journal: Computers & Security, Volume 148, Article 104178 (Q1, Computer Science, Information Systems)
Publication date: 2024-11-05
URL: https://www.sciencedirect.com/science/article/pii/S0167404824004838
Citations: 0
Abstract
Personalized information systems are information-filtering systems that endeavor to tailor information-exchange functionality to the specific interests of their users. The ability of these systems to profile users based on their search queries at Google, disclosed locations at Twitter or rated movies at Netflix is, on the one hand, what enables such intelligent functionality, but on the other, the source of serious privacy concerns. Leveraging the principle of data minimization, we propose a data-generalization mechanism that aims to protect users' privacy against non-fully-trusted personalized information systems. In our approach, users may disclose personal data to such systems when they feel comfortable doing so. When they do not, they may wish to replace specific, sensitive data with more general and thus less sensitive data before sharing this information with the personalized system in question. Generalization may therefore protect user privacy to a certain extent, but clearly at the cost of some information loss. In this work, we mathematically model an optimized version of this mechanism and theoretically investigate some key properties of the privacy-utility trade-off it poses. Experimental results on two real-world datasets demonstrate how our approach may contribute to privacy protection and show that it can outperform state-of-the-art perturbation techniques such as data forgery and suppression by providing higher utility for the same privacy level. On a practical level, the implications of our work for personalized online services are diverse. On the one hand, our mechanism allows each user to take charge of their own privacy individually, without the need to resort to third parties or share resources with other users. On the other hand, it provides privacy designers and engineers with a new data-perturbative mechanism with which to evaluate their systems in the presence of data that can be generalized according to a certain hierarchy, notably spatial generalization, with practical application in popular location-based services. Overall, we contribute a data-perturbation mechanism for privacy protection against user profiling that is optimal, deterministic, and local, based on a model in which third parties are untrusted.
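The hierarchy-based generalization the abstract describes can be sketched in a few lines: a specific, sensitive value is replaced by one of its ancestors in a fixed generalization hierarchy, trading utility for privacy. The spatial taxonomy and function below are illustrative assumptions for intuition only, not the paper's actual formulation or optimization.

```python
# Illustrative sketch of hierarchy-based data generalization (NOT the
# paper's exact mechanism): a disclosed value is climbed up a fixed
# generalization hierarchy; higher levels are less specific, hence
# less sensitive, but carry less information.

# Hypothetical spatial hierarchy: child -> parent (one level more general).
PARENT = {
    "Sagrada Familia": "Barcelona",
    "Barcelona": "Catalonia",
    "Catalonia": "Spain",
    "Spain": "Europe",
}

def generalize(value: str, levels: int) -> str:
    """Replace `value` by its ancestor `levels` steps up; stop at the root."""
    for _ in range(levels):
        if value not in PARENT:
            break  # already at the most general value in the hierarchy
        value = PARENT[value]
    return value

print(generalize("Sagrada Familia", 0))  # Sagrada Familia (full utility)
print(generalize("Sagrada Familia", 2))  # Catalonia (partial generalization)
print(generalize("Sagrada Familia", 9))  # Europe (root: most private, least useful)
```

The paper's contribution lies in choosing such generalization levels optimally (deterministically and locally, per user) to balance the privacy gained against the information lost; the sketch only shows the replacement step itself.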
About the journal:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides a unique blend of leading-edge research and sound practical management advice. It is aimed at professionals involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as the primary source of reference for applied research and technical expertise, it is your first step toward fully secure systems.