
Latest publications: Conference on Computer and Communications Security : proceedings of the ... conference on computer and communications security. ACM Conference on Computer and Communications Security

PreCurious: How Innocent Pre-Trained Language Models Turn into Privacy Traps.
Ruixuan Liu, Tianhao Wang, Yang Cao, Li Xiong

The pre-training and fine-tuning paradigm has demonstrated its effectiveness and has become the standard approach for tailoring language models to various tasks. Currently, community-based platforms offer easy access to various pre-trained models, as anyone can publish without strict validation processes. However, a released pre-trained model can be a privacy trap for fine-tuning datasets if it is carefully designed. In this work, we propose the PreCurious framework to reveal a new attack surface in which the attacker releases the pre-trained model and gains black-box access to the final fine-tuned model. PreCurious aims to escalate the general privacy risk of both membership inference and data extraction on the fine-tuning dataset. The key intuition behind PreCurious is to manipulate the memorization stage of the pre-trained model and guide fine-tuning with a seemingly legitimate configuration. While empirical and theoretical evidence suggests that parameter-efficient and differentially private fine-tuning techniques can defend against privacy attacks on a fine-tuned model, PreCurious demonstrates the possibility of breaking this invulnerability in a stealthy manner compared to fine-tuning on a benign pre-trained model. While differential privacy (DP) provides some mitigation against membership inference attacks, by further leveraging a sanitized dataset, PreCurious demonstrates potential vulnerabilities for targeted data extraction even under differentially private tuning with a strict privacy budget, e.g., ϵ = 0.05. Thus, PreCurious raises warnings for users about the potential risks of downloading pre-trained models from unknown sources, relying solely on tutorials or common-sense defenses, and releasing sanitized datasets even after perfect scrubbing.
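Membership inference, one of the two risks PreCurious escalates, is commonly instantiated as a loss-threshold test: fine-tuned models tend to assign lower loss to examples they were trained on. A minimal sketch of that generic test follows — it is not the paper's attack, and the losses and threshold below are hypothetical:

```python
def loss_threshold_mia(losses, threshold):
    """Flag a sample as a fine-tuning-set member when its loss falls
    below the threshold (memorized examples tend to have low loss)."""
    return [loss < threshold for loss in losses]

# Hypothetical per-example losses from a fine-tuned model.
member_losses = [0.12, 0.08, 0.30]   # examples seen during fine-tuning
nonmember_losses = [1.9, 2.4, 1.1]   # held-out examples
preds = loss_threshold_mia(member_losses + nonmember_losses, threshold=0.5)
# → [True, True, True, False, False, False]
```

A stronger pre-trained "trap" model widens the loss gap between members and non-members, which is exactly what makes such a threshold test more accurate.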

DOI: 10.1145/3658644.3690279 · Pages: 3511-3524 · Published: 2024-10-01 · Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12094715/pdf/
Citations: 0
Cross-silo Federated Learning with Record-level Personalized Differential Privacy.
Junxu Liu, Jian Lou, Li Xiong, Jinfei Liu, Xiaofeng Meng

Federated learning (FL) enhanced by differential privacy has emerged as a popular approach to better safeguard the privacy of client-side data by protecting clients' contributions during the training process. Existing solutions typically assume a uniform privacy budget for all records and provide one-size-fits-all solutions that may not be adequate to meet each record's privacy requirement. In this paper, we explore the uncharted territory of cross-silo FL with record-level personalized differential privacy. We devise a novel framework named rPDP-FL, employing a two-stage hybrid sampling scheme with both uniform client-level sampling and non-uniform record-level sampling to accommodate varying privacy requirements. A critical and non-trivial problem is how to determine the ideal per-record sampling probability q given the personalized privacy budget ε. We introduce a versatile solution named Simulation-CurveFitting, allowing us to uncover a significant insight into the nonlinear correlation between q and ε and derive an elegant mathematical model to tackle the problem. Our evaluation demonstrates that our solution can provide significant performance gains over baselines that do not consider personalized privacy preservation.
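The two-stage hybrid sampling at the core of rPDP-FL can be illustrated with a toy sketch: clients are drawn uniformly, then each record within a selected client is drawn with its own personalized probability q. The silo names and rates below are hypothetical, and this omits the Simulation-CurveFitting step that maps each record's ε to its q:

```python
import random

def two_stage_sample(clients, client_rate, record_rates, rng):
    """Stage 1: pick each client uniformly with probability client_rate.
    Stage 2: within a picked client, keep record i with its personalized
    probability record_rates[client][i] (non-uniform record sampling)."""
    batch = {}
    for client in clients:
        if rng.random() < client_rate:
            batch[client] = [i for i, q in enumerate(record_rates[client])
                            if rng.random() < q]
    return batch

rng = random.Random(0)
record_rates = {"silo_a": [0.9, 0.1, 0.5], "silo_b": [0.3, 0.8]}
batch = two_stage_sample(["silo_a", "silo_b"], client_rate=1.0,
                         record_rates=record_rates, rng=rng)
```

Records demanding stricter privacy (smaller ε) would receive a smaller q, so they appear in fewer training batches and benefit from stronger privacy amplification by subsampling.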

DOI: 10.1145/3658644.3670351 · Pages: 303-317 · Published: 2024-10-01 · Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12241667/pdf/
Citations: 0
The Danger of Minimum Exposures: Understanding Cross-App Information Leaks on iOS through Multi-Side-Channel Learning.
Zihao Wang, Jiale Guan, XiaoFeng Wang, Wenhao Wang, Luyi Xing, Fares Alharbi

Research on side-channel leaks has long focused on information exposure from a single channel (memory, network traffic, power, etc.). Less studied is the risk of learning from multiple side channels related to a target activity (e.g., website visits) even when individual channels are not informative enough for an effective attack. Although prior research took the first step in this direction, inferring the operations of foreground apps on iOS from a set of global statistics, it remains unclear how to determine the maximum information leakage from all target-related side channels on a system, what can be learned about the target from such leaks and, most importantly, how to control information leaks from the whole system, not just from an individual channel. To answer these fundamental questions, we performed the first systematic study on multi-channel inference, focusing on iOS as the first step. Our research is based upon a novel attack technique, called Mischief, which, given a set of potential side channels related to a target activity (e.g., foreground apps), utilizes probabilistic search to approximate an optimal subset of the channels exposing the most information, as measured by Merit Score, a metric for correlation-based feature selection. On such an optimal subset, an inference attack is modeled as a multivariate time series classification problem, so the state-of-the-art deep-learning-based solution, InceptionTime in particular, can be applied to achieve the best possible outcome. Mischief is found to work effectively on today's iOS (16.2), identifying foreground apps, website visits, and sensitive IoT operations (e.g., opening the door) with high confidence, even in an open-world scenario, which demonstrates that the protection Apple puts in place against the known attack is inadequate. Also importantly, this new understanding enables us to develop more comprehensive protection, which could elevate today's side-channel research from suppressing leaks from individual channels to controlling information exposure across the whole system.
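The Merit Score used to rank channel subsets comes from standard correlation-based feature selection (CFS): for a subset of k channels, it rewards average channel–target correlation and penalizes inter-channel redundancy. A sketch of that standard formula, with hypothetical correlation values:

```python
import math

def merit_score(k, avg_channel_target_corr, avg_channel_channel_corr):
    """CFS merit of a k-feature subset:
    merit = k * r_ct / sqrt(k + k*(k-1) * r_cc),
    where r_ct is the mean feature-target correlation and r_cc the
    mean pairwise feature-feature correlation."""
    return (k * avg_channel_target_corr) / math.sqrt(
        k + k * (k - 1) * avg_channel_channel_corr)

# A redundant 4-channel subset scores lower than a diverse one
# with the same relevance to the target activity.
diverse = merit_score(4, 0.5, 0.1)
redundant = merit_score(4, 0.5, 0.9)
```

This is why a probabilistic search over subsets can beat simply taking the individually most-correlated channels: adding a channel that duplicates information already in the subset lowers the denominator's redundancy term without adding relevance.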

DOI: 10.1145/3576915.3616655 · Pages: 281-295 · Published: 2023-11-01 · Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11466504/pdf/
Citations: 0
WristPrint: Characterizing User Re-identification Risks from Wrist-worn Accelerometry Data.
Nazir Saleheen, Md Azim Ullah, Supriyo Chakraborty, Deniz S Ones, Mani Srivastava, Santosh Kumar

Public releases of wrist-worn motion sensor data are growing. They enable and accelerate research on new algorithms to passively track daily activities, resulting in improved health and wellness utilities of smartwatches and activity trackers. But, when combined with sensitive-attribute inference attacks and linkage attacks via re-identification of the same user across multiple datasets, undisclosed sensitive attributes can be revealed to unintended organizations, with potentially adverse consequences for unsuspecting data-contributing users. To guide both users and data-collecting researchers, we characterize the re-identification risks inherent in motion sensor data collected from wrist-worn devices in users' natural environment. For this purpose, we use an open-set formulation, train a deep learning architecture with a new loss function, and apply our model to a new dataset consisting of 10 weeks of daily sensor wearing by 353 users. We find that re-identification risk increases with activity intensity. On average, such risk is 96% for a user when sharing a full day of sensor data.
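Open-set re-identification of the kind WristPrint studies must handle queries from users outside the enrolled set. A common baseline (not the paper's architecture) is nearest-centroid matching in an embedding space, with a distance threshold for rejecting unknown users; the user names and 2-D embeddings below are hypothetical:

```python
import math

def identify_open_set(query, centroids, threshold):
    """Return the enrolled user whose embedding centroid is nearest to
    the query, or None when even the nearest centroid exceeds the
    threshold (the open-set 'unknown user' decision)."""
    best_user, best_dist = None, float("inf")
    for user, centroid in centroids.items():
        d = math.dist(query, centroid)
        if d < best_dist:
            best_user, best_dist = user, d
    return best_user if best_dist <= threshold else None

# Hypothetical per-user centroids of accelerometry embeddings.
centroids = {"user_1": (0.0, 0.0), "user_2": (5.0, 5.0)}
```

The paper's finding that risk grows with activity intensity fits this picture: more intense motion yields more distinctive embeddings, pulling a user's queries closer to their own centroid and away from everyone else's.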

DOI: 10.1145/3460120.3484799 · Pages: 2807-2823 · Published: 2021-11-01 · Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9988376/pdf/nihms-1839082.pdf
Citations: 4
WAHC '21: Proceedings of the 9th on Workshop on Encrypted Computing & Applied Homomorphic Cryptography, Virtual Event, Korea, 15 November 2021
DOI: 10.1145/3474366 · Published: 2021-01-01
Citations: 0
CCS '21: 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual Event, Republic of Korea, November 15 - 19, 2021
DOI: 10.1145/3460120 · Published: 2021-01-01
Citations: 1
Incremental Learning Algorithm of Data Complexity Based on KNN Classifier
Jie Li, Yaxu Xue, Yadong Xu
DOI: 10.1109/CcS49175.2020.9231514 · Pages: 1-4 · Published: 2020-01-01
Citations: 0
How to Accurately and Privately Identify Anomalies.
Hafiz Asif, Periklis A Papakonstantinou, Jaideep Vaidya

Identifying anomalies in data is central to the advancement of science, national security, and finance. However, privacy concerns restrict our ability to analyze data. Can we lift these restrictions and accurately identify anomalies without hurting the privacy of those who contribute their data? We address this question for the most practically relevant case, where a record is considered anomalous relative to other records. We make four contributions. First, we introduce the notion of sensitive privacy, which conceptualizes what it means to privately identify anomalies. Sensitive privacy generalizes the important concept of differential privacy and is amenable to analysis. Importantly, sensitive privacy admits algorithmic constructions that provide strong and practically meaningful privacy and utility guarantees. Second, we show that differential privacy is inherently incapable of accurately and privately identifying anomalies; in this sense, our generalization is necessary. Third, we provide a general compiler that takes as input a differentially private mechanism (which has bad utility for anomaly identification) and transforms it into a sensitively private one. This compiler, which is mostly of theoretical importance, is shown to output a mechanism whose utility greatly improves over the utility of the input mechanism. As our fourth contribution we propose mechanisms for a popular definition of anomaly ((β, r)-anomaly) that (i) are guaranteed to be sensitively private, (ii) come with provable utility guarantees, and (iii) are empirically shown to have an overwhelmingly accurate performance over a range of datasets and evaluation criteria.
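Sensitive privacy generalizes differential privacy, whose canonical building block is the Laplace mechanism: releasing a statistic with noise scaled to sensitivity/ε. For context, here is a sketch of that standard DP primitive (not the paper's compiler or its (β, r)-anomaly mechanisms), using inverse-CDF sampling for the Laplace draw:

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """epsilon-DP release of a numeric statistic: add Laplace(0, b)
    noise with scale b = sensitivity / epsilon, drawn by inverting
    the Laplace CDF on a uniform sample."""
    b = sensitivity / epsilon
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    noise = -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_value + noise

rng = random.Random(0)
# A count query has sensitivity 1; a strict budget means heavy noise.
noisy_count = laplace_mechanism(100.0, sensitivity=1.0, epsilon=0.1, rng=rng)
```

The paper's point is that this kind of uniform noising is poorly matched to anomaly identification, since by definition the signal of interest lives in the tails that the noise is calibrated to hide; sensitive privacy keeps a rigorous guarantee while admitting mechanisms with usable accuracy for that task.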

DOI: 10.1145/3319535.3363209 · Pages: 719-736 · Published: 2019-11-01
Citations: 12
Session details: Session 8A: Web Security 1
Session Chair: Adam Doupé
DOI: 10.1145/3285889 · Published: 2018-10-08
Citations: 0
Session details: Session 6A: IoT Security
Session Chair: Yuqiong Sun
DOI: 10.1145/3285881 · Published: 2018-10-08
Citations: 0