
Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium — Latest Publications

Privacy-Preserving and Efficient Verification of the Outcome in Genome-Wide Association Studies.
Anisa Halimi, Leonard Dervishi, Erman Ayday, Apostolos Pyrgelis, Juan Ramón Troncoso-Pastoriza, Jean-Pierre Hubaux, Xiaoqian Jiang, Jaideep Vaidya

Providing provenance in scientific workflows is essential for reproducibility and auditability purposes. In this work, we propose a framework that verifies the correctness of the aggregate statistics obtained as a result of a genome-wide association study (GWAS) conducted by a researcher while protecting individuals' privacy in the researcher's dataset. In GWAS, the goal of the researcher is to identify point mutations (variants) that are highly associated with a given phenotype. The researcher publishes the workflow of the conducted study, its output, and associated metadata. They keep the research dataset private while providing, as part of the metadata, a partial noisy dataset (that achieves local differential privacy). To check the correctness of the workflow output, a verifier makes use of the workflow, its metadata, and results of another GWAS (conducted using publicly available datasets) to distinguish between correct statistics and incorrect ones. For evaluation, we use real genomic data and show that the correctness of the workflow output can be verified with high accuracy even when the aggregate statistics of a small number of variants are provided. We also quantify the privacy leakage due to the provided workflow and its associated metadata and show that the additional privacy risk due to the provided metadata does not increase the existing privacy risk due to sharing of the research results. Thus, our results show that the workflow output (i.e., research results) can be verified with high confidence in a privacy-preserving way. We believe that this work will be a valuable step towards providing provenance in a privacy-preserving way while providing guarantees to the users about the correctness of the results.
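The "partial noisy dataset" mentioned above is produced under local differential privacy. As a rough illustration of that idea only (the paper's actual mechanism may differ), the sketch below perturbs toy minor-allele counts with k-ary randomized response, a standard ε-LDP primitive; the variant names and the value of ε are made-up placeholders.

    import math
    import random

    def k_rr(value, domain, epsilon):
        # k-ary randomized response: keep the true value with probability
        # p = e^eps / (e^eps + k - 1), otherwise report a uniformly random
        # other value from the domain. This satisfies epsilon-LDP per record.
        k = len(domain)
        p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
        if random.random() < p_keep:
            return value
        return random.choice([v for v in domain if v != value])

    # Toy genotype record: minor-allele counts (0, 1 or 2) for a few variants.
    record = {"rs123": 0, "rs456": 1, "rs789": 2}
    noisy = {snp: k_rr(g, [0, 1, 2], epsilon=1.0) for snp, g in record.items()}
    print(noisy)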

{"title":"Privacy-Preserving and Efficient Verification of the Outcome in Genome-Wide Association Studies.","authors":"Anisa Halimi, Leonard Dervishi, Erman Ayday, Apostolos Pyrgelis, Juan Ramón Troncoso-Pastoriza, Jean-Pierre Hubaux, Xiaoqian Jiang, Jaideep Vaidya","doi":"10.56553/popets-2022-0094","DOIUrl":"10.56553/popets-2022-0094","url":null,"abstract":"<p><p>Providing provenance in scientific workflows is essential for reproducibility and auditability purposes. In this work, we propose a framework that verifies the correctness of the aggregate statistics obtained as a result of a genome-wide association study (GWAS) conducted by a researcher while protecting individuals' privacy in the researcher's dataset. In GWAS, the goal of the researcher is to identify highly associated point mutations (variants) with a given phenotype. The researcher publishes the workflow of the conducted study, its output, and associated metadata. They keep the research dataset private while providing, as part of the metadata, a partial noisy dataset (that achieves local differential privacy). To check the correctness of the workflow output, a verifier makes use of the workflow, its metadata, and results of another GWAS (conducted using publicly available datasets) to distinguish between correct statistics and incorrect ones. For evaluation, we use real genomic data and show that the correctness of the workflow output can be verified with high accuracy even when the aggregate statistics of a small number of variants are provided. We also quantify the privacy leakage due to the provided workflow and its associated metadata and show that the additional privacy risk due to the provided metadata does not increase the existing privacy risk due to sharing of the research results. Thus, our results show that the workflow output (i.e., research results) can be verified with high confidence in a privacy-preserving way. We believe that this work will be a valuable step towards providing provenance in a privacy-preserving way while providing guarantees to the users about the correctness of the results.</p>","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9536480/pdf/nihms-1802603.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33517178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FP-Radar: Longitudinal Measurement and Early Detection of Browser Fingerprinting
Pouneh Nikkhah Bahrami, Umar Iqbal, Zubair Shafiq
Browser fingerprinting is a stateless tracking technique that aims to combine information exposed by multiple different web APIs to create a unique identifier for tracking users across the web. Over the last decade, trackers have abused several existing and newly proposed web APIs to further enhance the browser fingerprint. Existing approaches are limited to detecting specific fingerprinting techniques at a particular point in time. Thus, they are unable to systematically detect novel fingerprinting techniques that abuse different web APIs. In this paper, we propose FP-Radar, a machine learning approach that leverages longitudinal measurements of web API usage on the top-100K websites over the last decade for early detection of new and evolving browser fingerprinting techniques. The results show that FP-Radar is able to detect, at an early stage, the abuse of newly introduced properties of already known (e.g., WebGL, Sensor) as well as previously unknown (e.g., Gamepad, Clipboard) APIs for browser fingerprinting. To the best of our knowledge, FP-Radar is the first to detect the abuse of the Visibility API for ephemeral fingerprinting in the wild.
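As a toy illustration of the general idea only (FP-Radar's actual pipeline builds on longitudinal API-usage measurements and is considerably richer), the sketch below represents each script by a binary vector of the web APIs it touches and trains an off-the-shelf classifier on hypothetical labels; the API subset and training data are illustrative, not the paper's.

    from sklearn.ensemble import RandomForestClassifier

    # Candidate web APIs used as features (illustrative subset).
    APIS = ["CanvasRenderingContext2D.getImageData",
            "WebGLRenderingContext.getParameter",
            "Navigator.getGamepads",
            "Clipboard.read",
            "Document.visibilityState"]

    def featurize(apis_used):
        # Binary indicator vector: which of the tracked APIs a script calls.
        return [1 if api in apis_used else 0 for api in APIS]

    # Hypothetical labelled scripts: 1 = fingerprinting, 0 = benign.
    X = [featurize({"CanvasRenderingContext2D.getImageData",
                    "WebGLRenderingContext.getParameter"}),
         featurize({"Document.visibilityState"}),
         featurize({"Navigator.getGamepads", "Clipboard.read"}),
         featurize(set())]
    y = [1, 0, 1, 0]

    clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
    print(clf.predict([featurize({"WebGLRenderingContext.getParameter"})]))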
{"title":"FP-Radar: Longitudinal Measurement and Early Detection of Browser Fingerprinting","authors":"Pouneh Nikkhah Bahrami, Umar Iqbal, Zubair Shafiq","doi":"10.2478/popets-2022-0056","DOIUrl":"https://doi.org/10.2478/popets-2022-0056","url":null,"abstract":"Abstract Browser fingerprinting is a stateless tracking technique that aims to combine information exposed by multiple different web APIs to create a unique identifier for tracking users across the web. Over the last decade, trackers have abused several existing and newly proposed web APIs to further enhance the browser fingerprint. Existing approaches are limited to detecting a specific fingerprinting technique(s) at a particular point in time. Thus, they are unable to systematically detect novel fingerprinting techniques that abuse different web APIs. In this paper, we propose FP-Radar, a machine learning approach that leverages longitudinal measurements of web API usage on top-100K websites over the last decade for early detection of new and evolving browser fingerprinting techniques. The results show that FP-Radar is able to early detect the abuse of newly introduced properties of already known (e.g., WebGL, Sensor) and as well as previously unknown (e.g., Gamepad, Clipboard) APIs for browser fingerprinting. To the best of our knowledge, FP-Radar is the first to detect the abuse of the Visibility API for ephemeral fingerprinting in the wild.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45750502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
SoK: Plausibly Deniable Storage
Chen Chen, Xiao Liang, Bogdan Carbunar, R. Sion
Data privacy is critical in instilling trust and empowering the societal pacts of modern technology-driven democracies. Unfortunately, it is under continuous attack by overreaching or outright oppressive governments, including some of the world’s oldest democracies. Increasingly intrusive anti-encryption laws severely limit the ability of standard encryption to protect privacy. New defense mechanisms are needed. Plausible deniability (PD) is a powerful property, enabling users to hide the existence of sensitive information in a system under direct inspection by adversaries. Popular encrypted storage systems such as TrueCrypt, as well as other research efforts, have also attempted to provide plausible deniability. Unfortunately, these efforts have often operated under less well-defined assumptions and adversarial models. Careful analyses often uncover not only high overheads but also outright security compromise. Further, our understanding of adversaries, the underlying storage technologies, and the available plausibly deniable solutions has evolved dramatically in the past two decades. The main goal of this work is to systematize this knowledge. It aims to: (1) identify key PD properties, requirements and approaches; (2) present a direly needed unified framework for evaluating security and performance; (3) explore the challenges arising from the critical interplay between PD and modern layered system stacks; (4) propose a new “trace-oriented” PD paradigm, able to decouple security guarantees from the underlying systems and thus ensure a higher level of flexibility and security independent of the technology stack. This work is also meant as a trusted guide for system and security practitioners through the major challenges in understanding, designing and implementing plausible deniability in new or existing systems.
{"title":"SoK: Plausibly Deniable Storage","authors":"Chen Chen, Xiao Liang, Bogdan Carbunar, R. Sion","doi":"10.2478/popets-2022-0039","DOIUrl":"https://doi.org/10.2478/popets-2022-0039","url":null,"abstract":"Abstract Data privacy is critical in instilling trust and empowering the societal pacts of modern technology-driven democracies. Unfortunately it is under continuous attack by overreaching or outright oppressive governments, including some of the world’s oldest democracies. Increasingly-intrusive anti-encryption laws severely limit the ability of standard encryption to protect privacy. New defense mechanisms are needed. Plausible deniability (PD) is a powerful property, enabling users to hide the existence of sensitive information in a system under direct inspection by adversaries. Popular encrypted storage systems such as TrueCrypt and other research efforts have attempted to also provide plausible deniability. Unfortunately, these efforts have often operated under less well-defined assumptions and adversarial models. Careful analyses often uncover not only high overheads but also outright security compromise. Further, our understanding of adversaries, the underlying storage technologies, as well as the available plausible deniable solutions have evolved dramatically in the past two decades. The main goal of this work is to systematize this knowledge. It aims to: (1) identify key PD properties, requirements and approaches; (2) present a direly-needed unified framework for evaluating security and performance; (3) explore the challenges arising from the critical interplay between PD and modern system layered stacks; (4) propose a new “trace-oriented” PD paradigm, able to decouple security guarantees from the underlying systems and thus ensure a higher level of flexibility and security independent of the technology stack. This work is meant also as a trusted guide for system and security practitioners around the major challenges in understanding, designing and implementing plausible deniability into new or existing systems.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42768085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Towards Improving Code Stylometry Analysis in Underground Forums
Michal Tereszkowski-Kaminski, S. Pastrana, Jorge Blasco, Guillermo Suarez-Tangil
Code Stylometry has emerged as a powerful mechanism to identify programmers. While there have been significant advances in the field, existing mechanisms underperform in challenging domains. One such domain is studying the provenance of code shared in underground forums, where code posts tend to have small or incomplete source code fragments. This paper proposes a method designed to deal with the idiosyncrasies of code snippets shared in these forums. As a novelty, our system fuses a forum-specific learning pipeline with Conformal Prediction to generate predictions with precise confidence levels. We see that identifying unreliable code snippets is paramount to generating high-accuracy predictions, and this is a task where traditional learning settings fail. Overall, our method performs twice as well as the state-of-the-art in a constrained setting with a large number of authors (i.e., 100). When dealing with a smaller number of authors (i.e., 20), it performs at high accuracy (89%). We also evaluate our work under an open-world assumption and see that our method is more effective at retaining samples.
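The confidence levels come from Conformal Prediction. Below is a minimal sketch of split conformal prediction, assuming a nonconformity score of one minus the model's probability for a class; the calibration scores, author names, and α are made-up placeholders rather than the paper's data or pipeline.

    import numpy as np

    def conformal_prediction_set(cal_scores, test_class_scores, alpha=0.1):
        # Split conformal prediction: cal_scores are nonconformity scores of the
        # true class on a held-out calibration set; test_class_scores maps each
        # candidate author to the score the test snippet gets under that label.
        # Every author whose score is at most the conformal quantile is kept,
        # which gives ~(1 - alpha) coverage under exchangeability.
        n = len(cal_scores)
        k = int(np.ceil((n + 1) * (1 - alpha)))
        q = np.sort(cal_scores)[min(k, n) - 1]
        return {author for author, s in test_class_scores.items() if s <= q}

    # Hypothetical scores: nonconformity = 1 - model probability of the class.
    calibration = np.array([0.05, 0.20, 0.10, 0.40, 0.15,
                            0.30, 0.25, 0.35, 0.12, 0.22])
    test = {"author_a": 0.08, "author_b": 0.55, "author_c": 0.28}
    print(conformal_prediction_set(calibration, test, alpha=0.1))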
{"title":"Towards Improving Code Stylometry Analysis in Underground Forums","authors":"Michal Tereszkowski-Kaminski, S. Pastrana, Jorge Blasco, Guillermo Suarez-Tangil","doi":"10.2478/popets-2022-0007","DOIUrl":"https://doi.org/10.2478/popets-2022-0007","url":null,"abstract":"Abstract Code Stylometry has emerged as a powerful mechanism to identify programmers. While there have been significant advances in the field, existing mechanisms underperform in challenging domains. One such domain is studying the provenance of code shared in underground forums, where code posts tend to have small or incomplete source code fragments. This paper proposes a method designed to deal with the idiosyncrasies of code snippets shared in these forums. Our system fuses a forum-specific learning pipeline with Conformal Prediction to generate predictions with precise confidence levels as a novelty. We see that identifying unreliable code snippets is paramount to generate high-accuracy predictions, and this is a task where traditional learning settings fail. Overall, our method performs as twice as well as the state-of-the-art in a constrained setting with a large number of authors (i.e., 100). When dealing with a smaller number of authors (i.e., 20), it performs at high accuracy (89%). We also evaluate our work on an open-world assumption and see that our method is more effective at retaining samples.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42962108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Polaris: Transparent Succinct Zero-Knowledge Arguments for R1CS with Efficient Verifier
Shihui Fu, G. Gong
We present a new zero-knowledge succinct argument of knowledge (zkSNARK) scheme for Rank-1 Constraint Satisfaction (R1CS), a widely deployed NP-complete language that generalizes arithmetic circuit satisfiability. By instantiating it with different commitment schemes, we obtain several zkSNARKs where the verifier’s costs and the proof size range from O(log² N) to O(√N), depending on the underlying polynomial commitment scheme, when applied to an N-gate arithmetic circuit. None of these schemes requires a trusted setup. The construction is plausibly post-quantum secure when instantiated with a secure collision-resistant hash function. We report on experiments evaluating the performance of our proposed system. For instance, for verifying a SHA-256 preimage (less than 23k AND gates) in zero-knowledge with 128-bit security, the proof size is less than 150kB and the verification time is less than 11ms, both competitive with existing systems.
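For readers unfamiliar with R1CS, the statement being proven is satisfiability of a system of rank-1 constraints: for each row i, (A_i·z)·(B_i·z) = C_i·z over a finite field. The sketch below only checks such a system on a toy instance (x·x = y); it is background for the constraint format, not part of Polaris's proof system.

    import numpy as np

    def r1cs_satisfied(A, B, C, z, p):
        # For every constraint row i, check (A_i . z) * (B_i . z) == (C_i . z) mod p.
        Az, Bz, Cz = (np.array(M) @ np.array(z) % p for M in (A, B, C))
        return bool(np.all((Az * Bz) % p == Cz % p))

    # Toy instance encoding x * x = y, with witness vector z = (1, x, y).
    p = 97
    A = [[0, 1, 0]]   # selects x
    B = [[0, 1, 0]]   # selects x
    C = [[0, 0, 1]]   # selects y
    print(r1cs_satisfied(A, B, C, z=[1, 5, 25], p=p))   # True:  5 * 5 == 25
    print(r1cs_satisfied(A, B, C, z=[1, 5, 24], p=p))   # False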
{"title":"Polaris: Transparent Succinct Zero-Knowledge Arguments for R1CS with Efficient Verifier","authors":"Shihui Fu, G. Gong","doi":"10.2478/popets-2022-0027","DOIUrl":"https://doi.org/10.2478/popets-2022-0027","url":null,"abstract":"Abstract We present a new zero-knowledge succinct argument of knowledge (zkSNARK) scheme for Rank-1 Constraint Satisfaction (RICS), a widely deployed NP-complete language that generalizes arithmetic circuit satisfiability. By instantiating with different commitment schemes, we obtain several zkSNARKs where the verifier’s costs and the proof size range from O(log2 N) to O(N) Oleft( {sqrt N } right) depending on the underlying polynomial commitment schemes when applied to an N-gate arithmetic circuit. All these schemes do not require a trusted setup. It is plausibly post-quantum secure when instantiated with a secure collision-resistant hash function. We report on experiments for evaluating the performance of our proposed system. For instance, for verifying a SHA-256 preimage (less than 23k AND gates) in zero-knowledge with 128 bits security, the proof size is less than 150kB and the verification time is less than 11ms, both competitive to existing systems.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41683580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
MLEFlow: Learning from History to Improve Load Balancing in Tor
Hussein Darir, Hussein Sibai, Chin-Yu Cheng, N. Borisov, G. Dullerud, S. Mitra
Tor has millions of daily users seeking privacy while browsing the Internet. It has thousands of relays to route users’ packets while anonymizing their sources and destinations. Users choose relays to forward their traffic according to probability distributions published by the Tor authorities. The authorities generate these probability distributions based on estimates of the capacities of the relays. They compute these estimates based on the bandwidths of probes sent to the relays. These estimates are necessary for better load balancing. Unfortunately, current methods fall short of providing accurate estimates, leaving the network underutilized and its capacities unfairly distributed between the users’ paths. We present MLEFlow, a maximum likelihood approach for estimating relay capacities for optimal load balancing in Tor. We show that MLEFlow generalizes a version of Tor capacity estimation, TorFlow-P, by making better use of measurement history. We prove that the mean of our estimate converges to a small interval around the actual capacities, while the variance converges to zero. We present two versions of MLEFlow: MLEFlow-CF, a closed-form approximation of the MLE, and MLEFlow-Q, a discretized and iterative approximation of the MLE that can account for noisy observations. We demonstrate the practical benefits of MLEFlow by simulating it using a flow-based Python simulator of a full Tor network and a packet-based Shadow simulation of a scaled-down version. In our simulations, MLEFlow provides significantly more accurate estimates, which result in improved user performance, with median download speeds increasing by 30%.
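To see why pooling measurement history helps, consider the deliberately simplified observation model below (a toy model, not MLEFlow's): in epoch t a probe shares the relay with n_t users and observes roughly capacity/n_t plus noise. Under Gaussian noise, the maximum-likelihood estimate of the capacity is a least-squares fit over the whole history, which is far more stable than any single-epoch guess; all numbers are synthetic.

    import numpy as np

    rng = np.random.default_rng(0)
    true_capacity = 120.0                  # MB/s, unknown to the estimator

    # History of epochs: n_t concurrent users share the relay, and the probe
    # observes o_t ~ true_capacity / n_t plus measurement noise.
    n = rng.integers(5, 50, size=30)
    obs = true_capacity / n + rng.normal(0, 0.4, size=30)

    # With o_t = c * x_t + Gaussian noise and x_t = 1 / n_t, the MLE of c is
    # ordinary least squares through the origin over the full history.
    x = 1.0 / n
    c_hat = np.sum(x * obs) / np.sum(x * x)
    print(f"single-epoch guess: {obs[0] * n[0]:.1f}   MLE over history: {c_hat:.1f}")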
{"title":"MLEFlow: Learning from History to Improve Load Balancing in Tor","authors":"Hussein Darir, Hussein Sibai, Chin-Yu Cheng, N. Borisov, G. Dullerud, S. Mitra","doi":"10.2478/popets-2022-0005","DOIUrl":"https://doi.org/10.2478/popets-2022-0005","url":null,"abstract":"Abstract Tor has millions of daily users seeking privacy while browsing the Internet. It has thousands of relays to route users’ packets while anonymizing their sources and destinations. Users choose relays to forward their traffic according to probability distributions published by the Tor authorities. The authorities generate these probability distributions based on estimates of the capacities of the relays. They compute these estimates based on the bandwidths of probes sent to the relays. These estimates are necessary for better load balancing. Unfortunately, current methods fall short of providing accurate estimates leaving the network underutilized and its capacities unfairly distributed between the users’ paths. We present MLEFlow, a maximum likelihood approach for estimating relay capacities for optimal load balancing in Tor. We show that MLEFlow generalizes a version of Tor capacity estimation, TorFlow-P, by making better use of measurement history. We prove that the mean of our estimate converges to a small interval around the actual capacities, while the variance converges to zero. We present two versions of MLEFlow: MLEFlow-CF, a closed-form approximation of the MLE and MLEFlow-Q, a discretization and iterative approximation of the MLE which can account for noisy observations. We demonstrate the practical benefits of MLEFlow by simulating it using a flow-based Python simulator of a full Tor network and packet-based Shadow simulation of a scaled down version. In our simulations MLEFlow provides significantly more accurate estimates, which result in improved user performance, with median download speeds increasing by 30%.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41710793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Polymath: Low-Latency MPC via Secure Polynomial Evaluations and Its Applications
Donghang Lu, Albert Yu, Aniket Kate, H. K. Maji
While the practicality of secure multi-party computation (MPC) has been extensively analyzed and improved over the past decade, we are hitting the limits of efficiency with the traditional approaches of representing the computed functionalities as generic arithmetic or Boolean circuits. This work follows the design principle of identifying and constructing fast and provably-secure MPC protocols to evaluate useful high-level algebraic abstractions, thus improving the efficiency of all applications relying on them. We present Polymath, a constant-round secure computation protocol suite for the secure evaluation of (multi-variate) polynomials of scalars and matrices, functionalities essential to numerous data-processing applications. Using precise natural precomputation and the high degree of parallelism prevalent in modern computing environments, Polymath can make the latency of secure polynomial evaluations of scalars and matrices independent of the polynomial degree and matrix dimensions. We implement our protocols over the HoneyBadgerMPC library and apply them to two prominent secure computation tasks: privacy-preserving evaluation of decision trees and privacy-preserving evaluation of Markov processes. For the decision tree evaluation problem, we demonstrate the feasibility of evaluating high-depth decision tree models in a general n-party setting. For the Markov process application, we demonstrate that Polymath can compute large powers of transition matrices with better online time and less communication.
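As background only (this is the substrate such protocols build on, not Polymath's actual construction), the sketch below shows additive secret sharing of matrices over a toy prime modulus: additions and multiplications by public constants are purely local, while multiplying two shared matrices is the step that requires interaction and is omitted here.

    import numpy as np

    P = 2**31 - 1                      # toy prime modulus
    rng = np.random.default_rng(1)

    def share(x, n_parties=3):
        # Additively secret-share an integer matrix: random shares summing to x mod P.
        shares = [rng.integers(0, P, size=x.shape) for _ in range(n_parties - 1)]
        shares.append((x - sum(shares)) % P)
        return shares

    def reconstruct(shares):
        return sum(shares) % P

    A = np.arange(4).reshape(2, 2)
    B = np.arange(4, 8).reshape(2, 2)
    sA, sB = share(A), share(B)

    # Addition and scaling by a public constant need no communication at all:
    sum_shares = [(a + b) % P for a, b in zip(sA, sB)]
    scaled_shares = [(3 * a) % P for a in sA]
    print(reconstruct(sum_shares))     # equals (A + B) mod P
    print(reconstruct(scaled_shares))  # equals (3 * A) mod P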
{"title":"Polymath: Low-Latency MPC via Secure Polynomial Evaluations and Its Applications","authors":"Donghang Lu, Albert Yu, Aniket Kate, H. K. Maji","doi":"10.2478/popets-2022-0020","DOIUrl":"https://doi.org/10.2478/popets-2022-0020","url":null,"abstract":"Abstract While the practicality of secure multi-party computation (MPC) has been extensively analyzed and improved over the past decade, we are hitting the limits of efficiency with the traditional approaches of representing the computed functionalities as generic arithmetic or Boolean circuits. This work follows the design principle of identifying and constructing fast and provably-secure MPC protocols to evaluate useful high-level algebraic abstractions; thus, improving the efficiency of all applications relying on them. We present Polymath, a constant-round secure computation protocol suite for the secure evaluation of (multi-variate) polynomials of scalars and matrices, functionalities essential to numerous data-processing applications. Using precise natural precomputation and high-degree of parallelism prevalent in the modern computing environments, Polymath can make latency of secure polynomial evaluations of scalars and matrices independent of polynomial degree and matrix dimensions. We implement our protocols over the HoneyBadgerMPC library and apply it to two prominent secure computation tasks: privacy-preserving evaluation of decision trees and privacy-preserving evaluation of Markov processes. For the decision tree evaluation problem, we demonstrate the feasibility of evaluating high-depth decision tree models in a general n-party setting. For the Markov process application, we demonstrate that Poly-math can compute large powers of transition matrices with better online time and less communication.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41468067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
If You Like Me, Please Don’t “Like” Me: Inferring Vendor Bitcoin Addresses From Positive Reviews
Jochen Schäfer, Christian Müller, Frederik Armknecht
Bitcoin and similar cryptocurrencies are becoming increasingly popular as a payment method in both legitimate and illegitimate online markets. Such markets usually deploy a review system that allows users to rate their purchases and help others identify reliable vendors. Consequently, vendors are interested in accumulating as many positive reviews (likes) as possible and in making these public. However, we present an attack that exploits this publicly available information to identify cryptocurrency addresses potentially belonging to vendors. In its basic variant, it focuses on vendors that reuse their addresses. We also show an extended variant that copes with the case in which addresses are used only once. We demonstrate the applicability of the attack by modeling Bitcoin transactions based on vendor reviews from two separate darknet markets and retrieving matching transactions from the blockchain. By doing so, we can identify Bitcoin addresses likely belonging to darknet market vendors.
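A stripped-down sketch of the basic matching idea follows: for each review, look for blockchain outputs whose value matches the listed price shortly before the review was posted, and count which addresses recur. The reviews, transactions, and addresses below are entirely made up, and the real attack handles exchange rates, fees, and timing far more carefully.

    from collections import Counter
    from datetime import datetime, timedelta

    # Hypothetical scraped reviews: (time posted, purchase price in BTC).
    reviews = [(datetime(2021, 3, 1, 14, 5), 0.0421),
               (datetime(2021, 3, 2, 9, 30), 0.0150)]

    # Hypothetical blockchain outputs: (time, receiving address, value in BTC).
    txs = [(datetime(2021, 3, 1, 13, 50), "1VendorAddrXYZ", 0.0421),
           (datetime(2021, 3, 1, 13, 55), "1UnrelatedAddr", 0.9000),
           (datetime(2021, 3, 2, 9, 10), "1VendorAddrXYZ", 0.0150)]

    def candidate_addresses(reviews, txs, window=timedelta(hours=12), tol=1e-6):
        # Count how often each address received an output matching a review's
        # price within the time window preceding the review.
        hits = Counter()
        for posted, price in reviews:
            for t, addr, value in txs:
                if posted - window <= t <= posted and abs(value - price) < tol:
                    hits[addr] += 1
        return hits

    print(candidate_addresses(reviews, txs).most_common())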
{"title":"If You Like Me, Please Don’t “Like” Me: Inferring Vendor Bitcoin Addresses From Positive Reviews","authors":"Jochen Schäfer, Christian Müller, Frederik Armknecht","doi":"10.2478/popets-2022-0022","DOIUrl":"https://doi.org/10.2478/popets-2022-0022","url":null,"abstract":"Abstract Bitcoin and similar cryptocurrencies are becoming increasingly popular as a payment method in both legitimate and illegitimate online markets. Such markets usually deploy a review system that allows users to rate their purchases and help others to determine reliable vendors. Consequently, vendors are interested into accumulating as many positive reviews (likes) as possible and to make these public. However, we present an attack that exploits these publicly available information to identify cryptocurrency addresses potentially belonging to vendors. In its basic variant, it focuses on vendors that reuse their addresses. We also show an extended variant that copes with the case that addresses are used only once. We demonstrate the applicability of the attack by modeling Bitcoin transactions based on vendor reviews of two separate darknet markets and retrieve matching transactions from the blockchain. By doing so, we can identify Bitcoin addresses likely belonging to darknet market vendors.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49008510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Zen and the art of model adaptation: Low-utility-cost attack mitigations in collaborative machine learning
Dmitrii Usynin, D. Rueckert, Jonathan Passerat-Palmbach, Georgios Kaissis
In this study, we aim to bridge the gap between the theoretical understanding of attacks against collaborative machine learning workflows and their practical ramifications by considering the effects of model architecture, learning setting and hyperparameters on the resilience against attacks. We refer to such mitigations as model adaptation. Through extensive experimentation on both benchmark and real-life datasets, we establish a more practical threat model for collaborative learning scenarios. In particular, we evaluate the impact of model adaptation by implementing a range of attacks belonging to the broader categories of model inversion and membership inference. Our experiments yield two noteworthy outcomes: they demonstrate the difficulty of actually conducting successful attacks under realistic settings when model adaptation is employed, and they highlight the challenge inherent in successfully combining model adaptation and formal privacy-preserving techniques to retain the optimal balance between model utility and attack resilience.
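One of the simplest attacks in the membership-inference family evaluated in such studies is confidence thresholding: overfitted models tend to be more confident on their training data. The sketch below is that generic baseline with made-up confidence values, not the paper's attack implementation.

    import numpy as np

    def confidence_attack(true_label_confidences, threshold=0.9):
        # Flag a sample as a training member if the model's confidence in the
        # true label exceeds a threshold.
        return true_label_confidences >= threshold

    # Hypothetical confidences of a target model on known members / non-members.
    members = np.array([0.99, 0.97, 0.88, 0.95])
    non_members = np.array([0.62, 0.91, 0.55, 0.70])

    tpr = confidence_attack(members).mean()
    fpr = confidence_attack(non_members).mean()
    print(f"attack TPR = {tpr:.2f}, FPR = {fpr:.2f}")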
{"title":"Zen and the art of model adaptation: Low-utility-cost attack mitigations in collaborative machine learning","authors":"Dmitrii Usynin, D. Rueckert, Jonathan Passerat-Palmbach, Georgios Kaissis","doi":"10.2478/popets-2022-0014","DOIUrl":"https://doi.org/10.2478/popets-2022-0014","url":null,"abstract":"Abstract In this study, we aim to bridge the gap between the theoretical understanding of attacks against collaborative machine learning workflows and their practical ramifications by considering the effects of model architecture, learning setting and hyperparameters on the resilience against attacks. We refer to such mitigations as model adaptation. Through extensive experimentation on both, benchmark and real-life datasets, we establish a more practical threat model for collaborative learning scenarios. In particular, we evaluate the impact of model adaptation by implementing a range of attacks belonging to the broader categories of model inversion and membership inference. Our experiments yield two noteworthy outcomes: they demonstrate the difficulty of actually conducting successful attacks under realistic settings when model adaptation is employed and they highlight the challenge inherent in successfully combining model adaptation and formal privacy-preserving techniques to retain the optimal balance between model utility and attack resilience.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42376043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Privacy-Preserving High-dimensional Data Collection with Federated Generative Autoencoder
Xue Jiang, Xuebing Zhou, Jens Grossklags
Business intelligence and AI services often involve the collection of copious amounts of multidimensional personal data. Since these data usually contain sensitive information about individuals, direct collection can lead to privacy violations. Local differential privacy (LDP) is currently considered a state-of-the-art solution for privacy-preserving data collection. However, existing LDP algorithms are not applicable to high-dimensional data, not only because of the increase in computation and communication cost, but also because of poor data utility. In this paper, we aim at addressing the curse-of-dimensionality problem in LDP-based high-dimensional data collection. Based on the idea of machine learning and data synthesis, we propose DP-Fed-Wae, an efficient privacy-preserving framework for collecting high-dimensional categorical data. With the combination of a generative autoencoder, federated learning, and differential privacy, our framework is capable of privately learning the statistical distributions of local data and generating high-utility synthetic data on the server side without revealing users’ private information. We have evaluated the framework in terms of data utility and privacy protection on a number of real-world datasets containing 68–124 classification attributes. We show that our framework outperforms the LDP-based baseline algorithms in capturing joint distributions and correlations of attributes and generating high-utility synthetic data. With a local privacy guarantee ε = 8, the machine learning models trained with the synthetic data generated by the baseline algorithm cause an accuracy loss of 10%–30%, whereas the accuracy loss is significantly reduced to less than 3%, and at best even less than 1%, with our framework. Extensive experimental results demonstrate the capability and efficiency of our framework in synthesizing high-dimensional data while striking a satisfactory utility-privacy balance.
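To give a flavour of the federated-learning-plus-DP ingredient (a generic DP-FedAvg-style treatment, not DP-Fed-Wae's exact mechanism), the sketch below clips each client's model update to bound its sensitivity and adds Gaussian noise before server-side aggregation; the update sizes, clip norm, and noise multiplier are arbitrary placeholders.

    import numpy as np

    rng = np.random.default_rng(0)

    def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1):
        # Clip the L2 norm of one client's update to bound per-client
        # sensitivity, then add Gaussian noise calibrated to that bound.
        norm = np.linalg.norm(update)
        clipped = update * min(1.0, clip_norm / (norm + 1e-12))
        return clipped + rng.normal(0, noise_multiplier * clip_norm, size=update.shape)

    # Hypothetical weight updates from three clients, aggregated on the server.
    updates = [np.random.default_rng(i).normal(size=4) for i in range(3)]
    aggregate = np.mean([privatize_update(u) for u in updates], axis=0)
    print(aggregate)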
{"title":"Privacy-Preserving High-dimensional Data Collection with Federated Generative Autoencoder","authors":"Xue Jiang, Xuebing Zhou, Jens Grossklags","doi":"10.2478/popets-2022-0024","DOIUrl":"https://doi.org/10.2478/popets-2022-0024","url":null,"abstract":"Abstract Business intelligence and AI services often involve the collection of copious amounts of multidimensional personal data. Since these data usually contain sensitive information of individuals, the direct collection can lead to privacy violations. Local differential privacy (LDP) is currently considered a state-ofthe-art solution for privacy-preserving data collection. However, existing LDP algorithms are not applicable to high-dimensional data; not only because of the increase in computation and communication cost, but also poor data utility. In this paper, we aim at addressing the curse-of-dimensionality problem in LDP-based high-dimensional data collection. Based on the idea of machine learning and data synthesis, we propose DP-Fed-Wae, an efficient privacy-preserving framework for collecting high-dimensional categorical data. With the combination of a generative autoencoder, federated learning, and differential privacy, our framework is capable of privately learning the statistical distributions of local data and generating high utility synthetic data on the server side without revealing users’ private information. We have evaluated the framework in terms of data utility and privacy protection on a number of real-world datasets containing 68–124 classification attributes. We show that our framework outperforms the LDP-based baseline algorithms in capturing joint distributions and correlations of attributes and generating high-utility synthetic data. With a local privacy guarantee ∈ = 8, the machine learning models trained with the synthetic data generated by the baseline algorithm cause an accuracy loss of 10% ~ 30%, whereas the accuracy loss is significantly reduced to less than 3% and at best even less than 1% with our framework. Extensive experimental results demonstrate the capability and efficiency of our framework in synthesizing high-dimensional data while striking a satisfactory utility-privacy balance.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46243940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9