ACM Transactions on Privacy and Security: Latest Articles

Paralinguistic Privacy Protection at the Edge
IF 2.3, Zone 4 (Computer Science), Q1 Computer Science, Pub Date: 2022-11-03, DOI: https://dl.acm.org/doi/10.1145/3570161
Ranya Aloufi, Hamed Haddadi, David Boyle

Voice user interfaces and digital assistants are rapidly entering our lives and becoming singular touch points spanning our devices. These always-on services capture and transmit our audio data to powerful cloud services for further processing and subsequent actions. Our voices and raw audio signals collected through these devices contain a host of sensitive paralinguistic information that is transmitted to service providers regardless of deliberate or false triggers. As our emotional patterns and sensitive attributes like our identity, gender, and well-being are easily inferred using deep acoustic models, we encounter a new generation of privacy risks by using these services. One approach to mitigate the risk of paralinguistic-based privacy breaches is to exploit a combination of cloud-based processing with privacy-preserving, on-device paralinguistic information learning and filtering before transmitting voice data.

In this paper we introduce EDGY, a configurable, lightweight, disentangled representation learning framework that transforms and filters high-dimensional voice data to identify and contain sensitive attributes at the edge prior to offloading to the cloud. We evaluate EDGY’s on-device performance and explore optimization techniques, including model quantization and knowledge distillation, to enable private, accurate and efficient representation learning on resource-constrained devices. Our results show that EDGY runs in tens of milliseconds with 0.2% relative improvement in ‘zero-shot’ ABX score or minimal performance penalties of approximately 5.95% word error rate (WER) in learning linguistic representations from raw voice signals, using a CPU and a single-core ARM processor without specialized hardware.

Citations: 0
Pareto-optimal Defenses for the Web Infrastructure: Theory and Practice
IF 2.3, Zone 4 (Computer Science), Q1 Computer Science, Pub Date: 2022-10-13, DOI: 10.1145/3567595
Giorgio Di Tizio, Patrick Speicher, Milivoj Simeonovski, M. Backes, Ben Stock, R. Künnemann
The integrity of the content a user is exposed to when browsing the web relies on a plethora of non-web technologies and an infrastructure of interdependent hosts, communication technologies, and trust relations. Incidents like the Chinese Great Cannon or the MyEtherWallet attack make it painfully clear: the security of end users hinges on the security of the surrounding infrastructure: routing, DNS, content delivery, and the PKI. There are many competing, but isolated proposals to increase security, from the network up to the application layer. So far, researchers have focused on analyzing attacks and defenses on specific layers. We still lack an evaluation of how, given the status quo of the web, these proposals can be combined, how effective they are, and at what cost the increase of security comes. In this work, we propose a graph-based analysis based on Stackelberg planning that considers a rich attacker model and a multitude of proposals from IPsec to DNSSEC and SRI. Our threat model considers the security of billions of users against attackers ranging from small hacker groups to nation-state actors. Analyzing the infrastructure of the Top 5k Alexa domains, we discover that the security mechanisms currently deployed are ineffective and that some infrastructure providers have a comparable threat potential to nations. We find a considerable increase of security (up to 13% protected web visits) is possible at a relatively modest cost, due to the effectiveness of mitigations at the application and transport layer, which dominate expensive infrastructure enhancements such as DNSSEC and IPsec.
Citations: 1
Pareto-Optimal Defenses for the Web Infrastructure: Theory and Practice
IF 2.3, Zone 4 (Computer Science), Q1 Computer Science, Pub Date: 2022-10-13, DOI: https://dl.acm.org/doi/10.1145/3567595
Giorgio Di Tizio, Patrick Speicher, Milivoj Simeonovski, Michael Backes, Ben Stock, Robert Künnemann

The integrity of the content a user is exposed to when browsing the web relies on a plethora of non-web technologies and an infrastructure of interdependent hosts, communication technologies, and trust relations. Incidents like the Chinese Great Cannon or the MyEtherWallet attack make it painfully clear: the security of end users hinges on the security of the surrounding infrastructure: routing, DNS, content delivery, and the PKI. There are many competing, but isolated proposals to increase security, from the network up to the application layer. So far, researchers have focused on analyzing attacks and defenses on specific layers. We still lack an evaluation of how, given the status quo of the web, these proposals can be combined, how effective they are, and at what cost the increase of security comes. In this work, we propose a graph-based analysis based on Stackelberg planning that considers a rich attacker model and a multitude of proposals from IPsec to DNSSEC and SRI. Our threat model considers the security of billions of users against attackers ranging from small hacker groups to nation-state actors. Analyzing the infrastructure of the Top 5k Alexa domains, we discover that the security mechanisms currently deployed are ineffective and that some infrastructure providers have a comparable threat potential to nations. We find a considerable increase of security (up to 13% protected web visits) is possible at a relatively modest cost, due to the effectiveness of mitigations at the application and transport layer, which dominate expensive infrastructure enhancements such as DNSSEC and IPsec.

Citations: 0
A Comparison of Systemic and Systematic Risks of Malware Encounters in Consumer and Enterprise Environments
IF 2.3, Zone 4 (Computer Science), Q1 Computer Science, Pub Date: 2022-10-03, DOI: 10.1145/3565362
Savino Dambra, Leyla Bilge, D. Balzarotti
Malware is still a widespread problem, and it is used by malicious actors to routinely compromise the security of computer systems. Consumers typically rely on a single AV product to detect and block possible malware infections, while corporations often install multiple security products, activate several layers of defenses, and establish security policies among employees. However, if a better security posture should lower the risk of malware infections, then the actual extent to which this happens is still under debate by risk analysis experts. Moreover, the difference in risks encountered by consumers and enterprises has never been empirically studied by using real-world data. In fact, the mere use of third-party software, network services, and the interconnected nature of our society necessarily exposes both classes of users to undiversifiable risks: Independently from how careful users are and how well they manage their cyber hygiene, a portion of that risk would simply exist because of the fact of using a computer, sharing the same networks, and running the same software. In this work, we shed light on both systemic (i.e., diversifiable and dependent on the security posture) and systematic (i.e., undiversifiable and independent of the cyber hygiene) risk classes. Leveraging the telemetry data of a popular security company, we compare, in the first part of our study, the effects that different security measures have on malware encounter risks in consumer and enterprise environments. In the second part, we conduct exploratory research on systematic risk, investigate the quality of nine different indicators we were able to extract from our telemetry, and provide, for the first time, quantitative indicators of their predictive power. Our results show that even if consumers have a slightly lower encounter rate than enterprises (9.8% vs. 12.0%), the latter do considerably better when selecting machines with an increasingly higher uptime (89% vs. 53%). 
The two segments also diverge when we separately consider the presence of Adware and Potentially Unwanted Applications (PUA) and the generic samples detected through behavioral signatures: While consumers have an encounter rate for Adware and PUA that is 6 times higher than enterprise machines, the latter on average match behavioral signatures 2 times more frequently than consumer machines. We find, instead, similar trends when analyzing the age of encountered signatures, and the prevalence of different classes of traditional malware (such as Ransomware and Cryptominers). Finally, our findings show that the amount of time a host is active, the volume of files generated on the machine, the number and reputation of vendors of the installed applications, the host geographical location, and its recurrent infected state carry useful information as indicators of systematic risk of malware encounters. Activity days and hours have a higher influence on the risk of consumers, increasing the odds of encountering malware by 4.51 and 2.65 times, respectively. In addition, we measure that the volume of files generated on the host is a reliable indicator, especially when considering Adware. We further report that the likelihood of encountering Worms and Adware is much higher (on average 8 times, across consumers and enterprises) for machines that have already reported this kind of signature in the past.
Citations: 0
A Comparison of Systemic and Systematic Risks of Malware Encounters in Consumer and Enterprise Environments
IF 2.3, Zone 4 (Computer Science), Q1 Computer Science, Pub Date: 2022-10-03, DOI: https://dl.acm.org/doi/10.1145/3565362
Savino Dambra, Leyla Bilge, Davide Balzarotti

Malware is still a widespread problem and it is used by malicious actors to routinely compromise the security of computer systems. Consumers typically rely on a single AV product to detect and block possible malware infections, while corporations often install multiple security products, activate several layers of defenses, and establish security policies among employees. However, if a better security posture should lower the risk of malware infections, the actual extent to which this happens is still under debate by risk analysis experts. Moreover, the difference in risks encountered by consumers and enterprises has never been empirically studied by using real-world data.

In fact, the mere use of third-party software, network services, and the interconnected nature of our society necessarily exposes both classes of users to undiversifiable risks: independently from how careful users are and how well they manage their cyber hygiene, a portion of that risk would simply exist because of the fact of using a computer, sharing the same networks, and running the same software.

In this work, we shed light on both systemic (i.e., diversifiable and dependent on the security posture) and systematic (i.e., undiversifiable and independent of the cyber hygiene) risk classes. Leveraging the telemetry data of a popular security company, we compare, in the first part of our study, the effects that different security measures have on malware encounter risks in consumer and enterprise environments. In the second part, we conduct exploratory research on systematic risk, investigate the quality of nine different indicators we were able to extract from our telemetry, and provide, for the first time, quantitative indicators of their predictive power.

Our results show that even if consumers have a slightly lower encounter rate than enterprises (9.8% vs 12.0%), the latter do considerably better when selecting machines with an increasingly higher uptime (89% vs 53%). The two segments also diverge when we separately consider the presence of Adware and Potentially Unwanted Applications (PUA), and the generic samples detected through behavioral signatures: while consumers have an encounter rate for Adware and PUA that is 6 times higher than enterprise machines, the latter on average match behavioral signatures two times more frequently than consumer machines. We find, instead, similar trends when analyzing the age of encountered signatures, and the prevalence of different classes of traditional malware (such as Ransomware and Cryptominers). Finally, our findings show that the amount of time a host is active, the volume of files generated on the machine, the number and reputation of vendors of the installed applications, the host geographical location and its recurrent infected state carry useful information as indicators of systematic risk of malware encounters. Activity days and hours have a higher influence on the risk of consumers, increasing the odds of encountering malware by 4.51 and 2.65 times, respectively. In addition, we measure that the volume of files generated on the host is a reliable indicator, especially when considering Adware. We further report that the likelihood of encountering Worms and Adware is much higher (on average 8 times, across consumers and enterprises) for machines that have already reported this kind of signature in the past.
Citations: 0
Privacy-preserving Decentralized Federated Learning over Time-varying Communication Graph
IF 2.3, Zone 4 (Computer Science), Q1 Computer Science, Pub Date: 2022-10-01, DOI: 10.1145/3591354
Yang Lu, Zhengxin Yu, N. Suri
Establishing how a set of learners can provide privacy-preserving federated learning in a fully decentralized (peer-to-peer, no coordinator) manner is an open problem. We propose the first privacy-preserving consensus-based algorithm for the distributed learners to achieve decentralized global model aggregation in an environment of high mobility, where participating learners and the communication graph between them may vary during the learning process. In particular, whenever the communication graph changes, the Metropolis-Hastings method [69] is applied to update the weighted adjacency matrix based on the current communication topology. In addition, Shamir's secret sharing (SSS) scheme [61] is integrated to preserve privacy while reaching consensus on the global model. The article establishes the correctness and privacy properties of the proposed algorithm. The computational efficiency is evaluated by a simulation built on a federated learning framework with a real-world dataset.
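The weight update mentioned in the abstract can be illustrated with the textbook Metropolis-Hastings rule for consensus averaging; the abstract does not spell out the exact rule the authors use, so the following is only a minimal sketch under that assumption. Each edge (i, j) gets weight 1 / (1 + max(deg(i), deg(j))) and each node keeps the remaining mass as a self-weight, which makes the matrix symmetric with rows summing to one. The function names, the dictionary-based graph encoding, and the scalar "model" values are all hypothetical choices made for illustration.

```python
def metropolis_hastings_weights(neighbors):
    """Build consensus weights for an undirected graph.

    neighbors: dict mapping each node to the set of its neighbours
    (assumed symmetric: j in neighbors[i] iff i in neighbors[j]).
    Returns a dict of dicts, weights[i][j], with rows summing to 1.
    """
    degree = {i: len(neighbors[i] - {i}) for i in neighbors}
    weights = {i: {} for i in neighbors}
    for i in neighbors:
        for j in neighbors[i]:
            if j != i:
                weights[i][j] = 1.0 / (1 + max(degree[i], degree[j]))
        # Whatever mass is left stays on the node itself.
        weights[i][i] = 1.0 - sum(weights[i].values())
    return weights


def consensus_step(values, weights):
    """One averaging round: each node mixes its value with its neighbours'."""
    return {
        i: sum(w * values[j] for j, w in weights[i].items())
        for i in weights
    }


def run_consensus(values, topologies, rounds):
    """Average over a time-varying graph, recomputing weights every round."""
    for t in range(rounds):
        current = topologies[t % len(topologies)]
        values = consensus_step(values, metropolis_hastings_weights(current))
    return values


# Hypothetical example: three learners whose links alternate between two paths.
path_a = {0: {1}, 1: {0, 2}, 2: {1}}
path_b = {0: {2}, 1: {2}, 2: {0, 1}}
final = run_consensus({0: 1.0, 1: 5.0, 2: 9.0}, [path_a, path_b], rounds=50)
print(final)  # every value approaches the average of the initial values
```

Because every weight matrix built this way is symmetric and row-stochastic, each round preserves the network-wide average, which is why the weights can simply be recomputed whenever the topology changes.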
Citations: 4
A Solicitous Approach to Smart Contract Verification
IF 2.3 | Zone 4 (Computer Science) | Q1 Computer Science | Pub Date: 2022-09-28 | DOI: https://dl.acm.org/doi/10.1145/3564699
Rodrigo Otoni, Matteo Marescotti, Leonardo Alt, Patrick Eugster, Antti E. J. Hyvärinen, Natasha Sharygina

Smart contracts are tempting targets of attacks, since they often hold and manipulate significant financial assets, are immutable after deployment, and have publicly available source code, with assets estimated on the order of millions of US dollars having been lost in the past due to vulnerabilities. Formal verification is thus a necessity, but smart contracts challenge the existing highly efficient techniques routinely applied in the symbolic verification of software, due to specificities not present in general programming languages. A common feature of existing works in this area is the attempt to reuse off-the-shelf verification tools designed for general programming languages. This reuse can lead to inefficiency and potentially unsound results, since domain translation is required. In this paper we describe a carefully crafted approach that models the central aspects of smart contracts natively, going from the contract to its logical representation without intermediary steps. We use the expressive and highly automatable logic of constrained Horn clauses for modeling, and we instantiate our approach for the Solidity language. A tool implementing our approach, called Solicitous, was developed and integrated into the SMTChecker module of the Solidity compiler solc. We evaluated our approach on an extensive benchmark set containing 22,446 real-world smart contracts deployed on the Ethereum blockchain over a 27-month period. The results show that our approach is able to establish safety of significantly more contracts than comparable, publicly available verification tools, with an order of magnitude increase in the percentage of formally verified contracts.
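To give a flavor of what modeling with constrained Horn clauses looks like, here is a toy encoding of a hypothetical contract whose only state is a counter c, treated as an unbounded integer for simplicity. The predicate name Inv and the contract itself are illustrative assumptions, and the encoding used by Solicitous is not given in the abstract.

```latex
% Toy example (hypothetical): one state variable c, one function that increments it,
% and the safety property "c is never negative".
\begin{align*}
c = 0 &\;\Rightarrow\; \mathit{Inv}(c) && \text{(state after the constructor)}\\
\mathit{Inv}(c) \land c' = c + 1 &\;\Rightarrow\; \mathit{Inv}(c') && \text{(effect of calling the increment function)}\\
\mathit{Inv}(c) \land c < 0 &\;\Rightarrow\; \mathit{false} && \text{(violation of the assertion } c \ge 0\text{)}
\end{align*}
```

The clauses are satisfiable with the interpretation Inv(c) ≡ c ≥ 0, which plays the role of a contract invariant. Finding such an interpretation is what a Horn solver automates, and its existence shows that the assertion cannot be violated in this toy model.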

Citations: 0
Time-Aware Anonymization of Knowledge Graphs
IF 2.3 | Zone 4 (Computer Science) | Q1 Computer Science | Pub Date: 2022-09-23 | DOI: https://dl.acm.org/doi/10.1145/3563694
Anh-Tu Hoang, Barbara Carminati, Elena Ferrari

Knowledge graphs (KGs) play an essential role in data sharing because they can model both users’ attributes and their relationships. KGs can support many data analyses, such as classification, where a sensitive attribute is selected and the analyst studies the associations between users and the sensitive attribute’s values (aka sensitive values). Data providers anonymize their KGs and share the anonymized versions to protect users’ privacy. Unfortunately, an adversary can exploit these attributes and relationships to infer sensitive information by monitoring either one or many snapshots of a KG. To cope with this issue, in this paper, we introduce (k, l)-Sequential Attribute Degree ((k, l)-sad), an extension of the \(k^w\)-tad principle [10], to ensure that sensitive values of re-identified users are diverse enough to prevent them from being inferred with a confidence higher than \(\frac{1}{l}\) even though adversaries monitor all published KGs. In addition, we develop the Time-Aware Knowledge Graph Anonymization Algorithm to anonymize KGs such that all published anonymized versions of a KG satisfy the (k, l)-sad principle while preserving the utility of the anonymized data. We conduct experiments on four real-life datasets to show the effectiveness of our proposal and compare it with \(k^w\)-tad.
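The 1/l confidence bound can be made concrete with a deliberately simplified, single-snapshot check in Python. It assumes that the adversary can narrow a target down to a group of indistinguishable users and then guesses the most frequent sensitive value in that group. The grouping, the function names, and the example data are hypothetical, and the full (k, l)-sad principle also covers attributes, degrees, and sequences of published snapshots, which this sketch does not model.

```python
from collections import Counter
from fractions import Fraction


def max_inference_confidence(sensitive_values):
    """Best guessing probability for one group: share of its most common value."""
    counts = Counter(sensitive_values)
    return Fraction(max(counts.values()), len(sensitive_values))


def satisfies_k_l(groups, k, l):
    """Simplified check: every group has at least k users (k >= 1) and no
    sensitive value can be inferred with confidence above 1/l."""
    if k < 1 or l < 1:
        raise ValueError("k and l must be positive integers")
    return all(
        len(group) >= k and max_inference_confidence(group) <= Fraction(1, l)
        for group in groups
    )


if __name__ == "__main__":
    # Each inner list holds the sensitive values of users the adversary cannot tell apart.
    groups = [["flu", "cold", "flu", "asthma"], ["flu", "cold", "asthma", "cold"]]
    print(satisfies_k_l(groups, k=4, l=2))
    print(satisfies_k_l(groups, k=4, l=3))
```

In both example groups the most common value covers half of the users, so the bound holds for l = 2 but not for l = 3.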

Citations: 0