Zhenpeng Shi, Nikolay Matyunin, Kalman Graffi, David Starobinski
Security assessment relies on public information about products, vulnerabilities, and weaknesses. So far, databases in these categories have rarely been analyzed in combination. Yet, doing so could help predict unreported vulnerabilities and identify common threat patterns. In this paper, we propose a methodology for producing and optimizing a knowledge graph that aggregates knowledge from common threat databases (CVE, CWE, and CPE). We apply the threat knowledge graph to predict associations between threat databases, specifically between products, vulnerabilities, and weaknesses. We evaluate the prediction performance both in closed world with associations from the knowledge graph, and in open world with associations revealed afterward. Using rank-based metrics (i.e., Mean Rank, Mean Reciprocal Rank, and Hits@N scores), we demonstrate the ability of the threat knowledge graph to uncover many associations that are currently unknown but will be revealed in the future, which remains useful over different time periods. We propose approaches to optimize the knowledge graph, and show that they indeed help in further uncovering associations. We have made the artifacts of our work publicly available.
{"title":"Uncovering CWE-CVE-CPE Relations with Threat Knowledge Graphs","authors":"Zhenpeng Shi, Nikolay Matyunin, Kalman Graffi, David Starobinski","doi":"10.1145/3641819","DOIUrl":"https://doi.org/10.1145/3641819","url":null,"abstract":"<p>Security assessment relies on public information about products, vulnerabilities, and weaknesses. So far, databases in these categories have rarely been analyzed in combination. Yet, doing so could help predict unreported vulnerabilities and identify common threat patterns. In this paper, we propose a methodology for producing and optimizing a knowledge graph that aggregates knowledge from common threat databases (CVE, CWE, and CPE). We apply the threat knowledge graph to predict associations between threat databases, specifically between products, vulnerabilities, and weaknesses. We evaluate the prediction performance both in closed world with associations from the knowledge graph, and in open world with associations revealed afterward. Using rank-based metrics (i.e., Mean Rank, Mean Reciprocal Rank, and Hits@N scores), we demonstrate the ability of the threat knowledge graph to uncover many associations that are currently unknown but will be revealed in the future, which remains useful over different time periods. We propose approaches to optimize the knowledge graph, and show that they indeed help in further uncovering associations. We have made the artifacts of our work publicly available.</p>","PeriodicalId":56050,"journal":{"name":"ACM Transactions on Privacy and Security","volume":"1 1","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139501391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bitcoin uses blockchain technology to maintain transactions order and provides probabilistic guarantees to prevent double-spending, assuming that an attacker’s computational power does not exceed 50% of the network power. In this paper, we design a novel bribery attack and show that this guarantee can be hugely undermined. Miners are assumed to be rational in this setup and they are given incentives that are dynamically calculated. In this attack, the adversary misuses the Bitcoin protocol to bribe miners and maximize their gained advantage. We will reformulate the bribery attack to propose a general mathematical foundation upon which we build multiple strategies. We show that, unlike Whale Attack, these strategies are practical, especially in the future when halvings lower the mining rewards. In the so called ’guaranteed variable-rate bribing with commitment’ strategy, through optimization by Differential Evolution (DE), we show how double spending is possible in the Bitcoin ecosystem for any transaction whose value is above 218.9BTC, and this comes with 100% success rate. A slight reduction in the success probability, e.g. by 10%, brings the threshold down to 165BTC. If the rationality assumption holds, this shows how vulnerable blockchain-based systems like Bitcoin are. We suggest a soft fork on Bitcoin to fix this issue at the end.
{"title":"Is Bitcoin Future as Secure as We Think? Analysis of Bitcoin Vulnerability to Bribery Attacks Launched through Large Transactions","authors":"Ghader Ebrahimpour, Mohammad Sayad Haghighi","doi":"10.1145/3641546","DOIUrl":"https://doi.org/10.1145/3641546","url":null,"abstract":"<p>Bitcoin uses blockchain technology to maintain transactions order and provides probabilistic guarantees to prevent double-spending, assuming that an attacker’s computational power does not exceed 50% of the network power. In this paper, we design a novel bribery attack and show that this guarantee can be hugely undermined. Miners are assumed to be rational in this setup and they are given incentives that are dynamically calculated. In this attack, the adversary misuses the Bitcoin protocol to bribe miners and maximize their gained advantage. We will reformulate the bribery attack to propose a general mathematical foundation upon which we build multiple strategies. We show that, unlike Whale Attack, these strategies are practical, especially in the future when halvings lower the mining rewards. In the so called ’guaranteed variable-rate bribing with commitment’ strategy, through optimization by Differential Evolution (DE), we show how double spending is possible in the Bitcoin ecosystem for any transaction whose value is above 218.9BTC, and this comes with 100% success rate. A slight reduction in the success probability, e.g. by 10%, brings the threshold down to 165BTC. If the rationality assumption holds, this shows how vulnerable blockchain-based systems like Bitcoin are. We suggest a soft fork on Bitcoin to fix this issue at the end.</p>","PeriodicalId":56050,"journal":{"name":"ACM Transactions on Privacy and Security","volume":"13 1","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139499276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Lightning Network (LN) is a second layer system for solving the scalability problem of Bitcoin transactions. In the current implementation of LN, channel capacity (i.e., the sum of individual balances held in the channel) is public information, while individual balances are kept secret for privacy concerns. Attackers may discover a particular balance of a channel by sending multiple fake payments through the channel. Such an attack, however, can hardly threaten the security of the LN system due to its high cost and noticeable intrusions. In this work, we present a novel non-intrusive balance tomography attack, which infers channel balances silently by performing legal transactions between two pre-created LN nodes. To minimize the cost of the attack, we propose an algorithm to compute the optimal payment amount for each transaction and design a path construction method using reinforcement learning to explore the most informative path to conduct the transactions. Finally, we propose two approaches (NIBT-RL and NIBT-RL-β) to accurately and efficiently infer all individual balances using the results of these transactions. Experiments using simulated account balances over actual LN topology show that our method can accurately infer (90%sim 94% ) of all balances in LN with around 12 USD.
{"title":"Non-Intrusive Balance Tomography Using Reinforcement Learning in the Lightning Network","authors":"Yan Qiao, Kui Wu, Majid Khabbazian","doi":"10.1145/3639366","DOIUrl":"https://doi.org/10.1145/3639366","url":null,"abstract":"<p>The Lightning Network (LN) is a second layer system for solving the scalability problem of Bitcoin transactions. In the current implementation of LN, channel capacity (i.e., the sum of individual balances held in the channel) is public information, while individual balances are kept secret for privacy concerns. Attackers may discover a particular balance of a channel by sending multiple <i>fake</i> payments through the channel. Such an attack, however, can hardly threaten the security of the LN system due to its high cost and noticeable intrusions. In this work, we present a novel <i>non-intrusive balance tomography</i> attack, which infers channel balances silently by performing legal transactions between two pre-created LN nodes. To minimize the cost of the attack, we propose an algorithm to compute the optimal payment amount for each transaction and design a path construction method using reinforcement learning to explore the most informative path to conduct the transactions. Finally, we propose two approaches (NIBT-RL and NIBT-RL-<i>β</i>) to accurately and efficiently infer all individual balances using the results of these transactions. Experiments using simulated account balances over actual LN topology show that our method can accurately infer (90%sim 94% ) of all balances in LN with around 12 USD.</p>","PeriodicalId":56050,"journal":{"name":"ACM Transactions on Privacy and Security","volume":"8 1","pages":""},"PeriodicalIF":2.3,"publicationDate":"2023-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139078799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Liqun Chen, Changyu Dong, Christopher J. P. Newton, Yalan Wang
Group signatures and their variants have been widely used in privacy-sensitive scenarios such as anonymous authentication and attestation. In this paper, we present a new post-quantum group signature scheme from symmetric primitives. Using only symmetric primitives makes the scheme less prone to unknown attacks than basing the design on newly proposed hard problems whose security is less well-understood. However, symmetric primitives do not have rich algebraic properties, and this makes it extremely challenging to design a group signature scheme on top of them. It is even more challenging if we want a group signature scheme suitable for real-world applications, one that can support large groups and require few trust assumptions. Our scheme is based on MPC-in-the-head non-interactive zero-knowledge proofs, and we specifically design a novel hash-based group credential scheme, which is rooted in the SPHINCS+ signature scheme but with various modifications to make it MPC (multi-party computation) friendly. The security of the scheme has been proved under the fully dynamic group signature model. We provide an implementation of the scheme and demonstrate the feasibility of handling a group size as large as 260. This is the first group signature scheme from symmetric primitives that supports such a large group size and meets all the security requirements.
{"title":"Sphinx-in-the-Head: Group Signatures from Symmetric Primitives","authors":"Liqun Chen, Changyu Dong, Christopher J. P. Newton, Yalan Wang","doi":"10.1145/3638763","DOIUrl":"https://doi.org/10.1145/3638763","url":null,"abstract":"<p>Group signatures and their variants have been widely used in privacy-sensitive scenarios such as anonymous authentication and attestation. In this paper, we present a new post-quantum group signature scheme from symmetric primitives. Using only symmetric primitives makes the scheme less prone to unknown attacks than basing the design on newly proposed hard problems whose security is less well-understood. However, symmetric primitives do not have rich algebraic properties, and this makes it extremely challenging to design a group signature scheme on top of them. It is even more challenging if we want a group signature scheme suitable for real-world applications, one that can support large groups and require few trust assumptions. Our scheme is based on MPC-in-the-head non-interactive zero-knowledge proofs, and we specifically design a novel hash-based group credential scheme, which is rooted in the SPHINCS+ signature scheme but with various modifications to make it MPC (multi-party computation) friendly. The security of the scheme has been proved under the fully dynamic group signature model. We provide an implementation of the scheme and demonstrate the feasibility of handling a group size as large as 2<sup>60</sup>. This is the first group signature scheme from symmetric primitives that supports such a large group size and meets all the security requirements.</p>","PeriodicalId":56050,"journal":{"name":"ACM Transactions on Privacy and Security","volume":"41 1","pages":""},"PeriodicalIF":2.3,"publicationDate":"2023-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139063268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Li Wang, Xiangtao Meng, Dan Li, Xuhong Zhang, Shouling Ji, Shanqing Guo
DeepFake data contains realistically manipulated faces - its abuses pose a huge threat to the security and privacy-critical applications. Intensive research from academia and industry has produced many deepfake/detection models, leading to a constant race of attack and defense. However, due to the lack of a unified evaluation platform, many critical questions on this subject remain largely unexplored. (i) How is the anti-detection ability of the existing deepfake models? (ii) How generalizable are existing detection models against different deepfake samples? (iii) How effective are the detection APIs provided by the cloud-based vendors? (iv) How evasive and transferable are adversarial deepfakes in the lab and real-world environment? (v) How do various factors impact the performance of deepfake and detection models?
To bridge the gap, we design and implement DEEPFAKER, a unified and comprehensive deepfake-detection evaluation platform. Specifically, DEEPFAKER has integrated 10 state-of-the-art deepfake methods and 9 representative detection methods, while providing a user-friendly interface and modular design that allows for easy integration of new methods. Leveraging DEEPFAKER, we conduct a large-scale empirical study of facial deepfake/detection models and draw a set of key findings: (i) the detection methods have poor generalization on samples generated by different deepfake methods; (ii) there is no significant correlation between anti-detection ability and visual quality of deepfake samples; (iii) the current detection APIs have poor detection performance and adversarial deepfakes can achieve about 70% ASR (attack success rate) on all cloud-based vendors, calling for an urgent need to deploy effective and robust detection APIs; (iv) the detection methods in the lab are more robust against transfer attacks than the detection APIs in the real-world environment; (v) deepfake videos may not always be more difficult to detect after video compression. We envision that DEEPFAKER will benefit future research on facial deepfake and detection.
{"title":"DEEPFAKER: A Unified Evaluation Platform for Facial Deepfake and Detection Models","authors":"Li Wang, Xiangtao Meng, Dan Li, Xuhong Zhang, Shouling Ji, Shanqing Guo","doi":"10.1145/3634914","DOIUrl":"https://doi.org/10.1145/3634914","url":null,"abstract":"<p>DeepFake data contains realistically manipulated faces - its abuses pose a huge threat to the security and privacy-critical applications. Intensive research from academia and industry has produced many deepfake/detection models, leading to a constant race of attack and defense. However, due to the lack of a unified evaluation platform, many critical questions on this subject remain largely unexplored. <i>(i)</i> How is the anti-detection ability of the existing deepfake models? <i>(ii)</i> How generalizable are existing detection models against different deepfake samples? <i>(iii)</i> How effective are the detection APIs provided by the cloud-based vendors? <i>(iv)</i> How evasive and transferable are adversarial deepfakes in the lab and real-world environment? <i>(v)</i> How do various factors impact the performance of deepfake and detection models? </p><p>To bridge the gap, we design and implement <monospace>DEEPFAKER</monospace>, a unified and comprehensive deepfake-detection evaluation platform. Specifically, <monospace>DEEPFAKER</monospace> has integrated 10 state-of-the-art deepfake methods and 9 representative detection methods, while providing a user-friendly interface and modular design that allows for easy integration of new methods. Leveraging <monospace>DEEPFAKER</monospace>, we conduct a large-scale empirical study of facial deepfake/detection models and draw a set of key findings: <i>(i)</i> the detection methods have poor generalization on samples generated by different deepfake methods; <i>(ii)</i> there is no significant correlation between anti-detection ability and visual quality of deepfake samples; <i>(iii)</i> the current detection APIs have poor detection performance and adversarial deepfakes can achieve about 70% ASR (attack success rate) on all cloud-based vendors, calling for an urgent need to deploy effective and robust detection APIs; <i>(iv)</i> the detection methods in the lab are more robust against transfer attacks than the detection APIs in the real-world environment; <i>(v)</i> deepfake videos may not always be more difficult to detect after video compression. We envision that <monospace>DEEPFAKER</monospace> will benefit future research on facial deepfake and detection.</p>","PeriodicalId":56050,"journal":{"name":"ACM Transactions on Privacy and Security","volume":"32 1","pages":""},"PeriodicalIF":2.3,"publicationDate":"2023-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138540694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Network traffic classification has many applications, such as security monitoring, quality of service, traffic engineering, etc. For the aforementioned applications, Deep Packet Inspection (DPI) is a popularly used technique for traffic classification because it scrutinizes the payload and provides comprehensive information for accurate analysis of network traffic. However, DPI-based methods reduce network performance because they are computationally expensive and hinder end-user privacy as they analyze the payload. To overcome these challenges, bit-level signatures are significantly used to perform network traffic classification. However, most of these methods still need to improve performance as they perform one-by-one signature matching of unknown payloads with application signatures for classification. Moreover, these methods become stagnant with the increase in application signatures. Therefore, to fill this gap, we propose OptiClass, an optimized classifier for application protocols using bit-level signatures. OptiClass performs parallel application signature matching with unknown flows, which results in faster, more accurate, and more efficient network traffic classification. OptiClass achieves twofold performance gains compared to the state-of-the-art methods. First, OptiClass generates bit-level signatures of just 32 bits for all the applications. This keeps OptiClass swift and privacy-preserving. Second, OptiClass uses a novel data structure called BiTSPLITTER for signature matching for fast and accurate classification. We evaluated the performance of OptiClass on three datasets consisting of twenty application protocols. Experimental results report that OptiClass has an average recall, precision, and F1-score of 97.36%, 97.38%, and 97.37%, respectively, and an average classification speed of 9.08 times faster than five closely related state-of-the-art methods.
{"title":"OptiClass: An Optimized Classifier for Application Layer Protocols Using Bit Level Signatures","authors":"Mayank Swarnkar, Neha Sharma","doi":"10.1145/3633777","DOIUrl":"https://doi.org/10.1145/3633777","url":null,"abstract":"<p>Network traffic classification has many applications, such as security monitoring, quality of service, traffic engineering, etc. For the aforementioned applications, Deep Packet Inspection (DPI) is a popularly used technique for traffic classification because it scrutinizes the payload and provides comprehensive information for accurate analysis of network traffic. However, DPI-based methods reduce network performance because they are computationally expensive and hinder end-user privacy as they analyze the payload. To overcome these challenges, bit-level signatures are significantly used to perform network traffic classification. However, most of these methods still need to improve performance as they perform one-by-one signature matching of unknown payloads with application signatures for classification. Moreover, these methods become stagnant with the increase in application signatures. Therefore, to fill this gap, we propose <i>OptiClass</i>, an optimized classifier for application protocols using bit-level signatures. <i>OptiClass</i> performs parallel application signature matching with unknown flows, which results in faster, more accurate, and more efficient network traffic classification. <i>OptiClass</i> achieves twofold performance gains compared to the state-of-the-art methods. First, <i>OptiClass</i> generates bit-level signatures of just 32 bits for all the applications. This keeps <i>OptiClass</i> swift and privacy-preserving. Second, <i>OptiClass</i> uses a novel data structure called <i>BiTSPLITTER</i> for signature matching for fast and accurate classification. We evaluated the performance of <i>OptiClass</i> on three datasets consisting of twenty application protocols. Experimental results report that <i>OptiClass</i> has an average recall, precision, and F1-score of 97.36%, 97.38%, and 97.37%, respectively, and an average classification speed of 9.08 times faster than five closely related state-of-the-art methods.</p>","PeriodicalId":56050,"journal":{"name":"ACM Transactions on Privacy and Security","volume":"207 1","pages":""},"PeriodicalIF":2.3,"publicationDate":"2023-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138540712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maksim E. Eren, Manish Bhattarai, Robert J. Joyce, Edward Raff, Charles Nicholas, Boian S. Alexandrov
Identification of the family to which a malware specimen belongs is essential in understanding the behavior of the malware and developing mitigation strategies. Solutions proposed by prior work, however, are often not practicable due to the lack of realistic evaluation factors. These factors include learning under class imbalance, the ability to identify new malware, and the cost of production-quality labeled data. In practice, deployed models face prominent, rare, and new malware families. At the same time, obtaining a large quantity of up-to-date labeled malware for training a model can be expensive. In this article, we address these problems and propose a novel hierarchical semi-supervised algorithm, which we call the HNMFk Classifier , that can be used in the early stages of the malware family labeling process. Our method is based on non-negative matrix factorization with automatic model selection, that is, with an estimation of the number of clusters. With HNMFk Classifier , we exploit the hierarchical structure of the malware data together with a semi-supervised setup, which enables us to classify malware families under conditions of extreme class imbalance. Our solution can perform abstaining predictions, or rejection option, which yields promising results in the identification of novel malware families and helps with maintaining the performance of the model when a low quantity of labeled data is used. We perform bulk classification of nearly 2,900 both rare and prominent malware families, through static analysis, using nearly 388,000 samples from the EMBER-2018 corpus. In our experiments, we surpass both supervised and semi-supervised baseline models with an F1 score of 0.80.
{"title":"Semi-supervised Classification of Malware Families Under Extreme Class Imbalance via Hierarchical Non-Negative Matrix Factorization with Automatic Model Selection","authors":"Maksim E. Eren, Manish Bhattarai, Robert J. Joyce, Edward Raff, Charles Nicholas, Boian S. Alexandrov","doi":"10.1145/3624567","DOIUrl":"https://doi.org/10.1145/3624567","url":null,"abstract":"Identification of the family to which a malware specimen belongs is essential in understanding the behavior of the malware and developing mitigation strategies. Solutions proposed by prior work, however, are often not practicable due to the lack of realistic evaluation factors. These factors include learning under class imbalance, the ability to identify new malware, and the cost of production-quality labeled data. In practice, deployed models face prominent, rare, and new malware families. At the same time, obtaining a large quantity of up-to-date labeled malware for training a model can be expensive. In this article, we address these problems and propose a novel hierarchical semi-supervised algorithm, which we call the HNMFk Classifier , that can be used in the early stages of the malware family labeling process. Our method is based on non-negative matrix factorization with automatic model selection, that is, with an estimation of the number of clusters. With HNMFk Classifier , we exploit the hierarchical structure of the malware data together with a semi-supervised setup, which enables us to classify malware families under conditions of extreme class imbalance. Our solution can perform abstaining predictions, or rejection option, which yields promising results in the identification of novel malware families and helps with maintaining the performance of the model when a low quantity of labeled data is used. We perform bulk classification of nearly 2,900 both rare and prominent malware families, through static analysis, using nearly 388,000 samples from the EMBER-2018 corpus. In our experiments, we surpass both supervised and semi-supervised baseline models with an F1 score of 0.80.","PeriodicalId":56050,"journal":{"name":"ACM Transactions on Privacy and Security","volume":"4 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134992865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Malware is commonly used by adversaries to compromise and infiltrate cyber systems in order to steal sensitive information or destroy critical assets. Active Cyber Deception (ACD) has emerged as an effective proactive cyber defense against malware to enable misleading adversaries by presenting fake data and engaging them to learn novel attack techniques. However, real-time malware deception is a complex and challenging task because (1) it requires a comprehensive understanding of the malware behaviors at technical and tactical levels in order to create the appropriate deception ploys and resources that can leverage this behavior and mislead malware, and (2) it requires a configurable yet provably valid deception planning to guarantee effective and safe real-time deception orchestration. This article presents symbSODA, a highly configurable and verifiable cyber deception system that analyzes real-world malware using multipath execution to discover API patterns that represent attack techniques/tactics critical for deception, enables users to create their own customized deception ploys based on the malware type and objectives, allows for constructing conflict-free Deception Playbooks , and finally automates the deception orchestration to execute the malware inside a deceptive environment. symbSODA extracts Malicious Sub-graphs (MSGs) consisting of WinAPIs from real-world malware and maps them to tactics and techniques using the ATT&CK framework to facilitate the construction of meaningful user-defined deception playbooks. We conducted a comprehensive evaluation study on symbSODA using 255 recent malware samples. We demonstrated that the accuracy of the end-to-end malware deception is 95% on average, with negligible overhead using various deception goals and strategies. Furthermore, our approach successfully extracted MSGs with a 97% recall, and our MSG-to-MITRE mapping achieved a top-1 accuracy of 88.75%. Our study suggests that symbSODA can serve as a general-purpose Malware Deception Factory to automatically produce customized deception playbooks against arbitrary malware behavior.
{"title":"symbSODA: Configurable and Verifiable Orchestration Automation for Active Malware Deception","authors":"Md Sajidul Islam Sajid, Jinpeng Wei, Ehab Al-Shaer, Qi Duan, Basel Abdeen, Latifur Khan","doi":"10.1145/3624568","DOIUrl":"https://doi.org/10.1145/3624568","url":null,"abstract":"Malware is commonly used by adversaries to compromise and infiltrate cyber systems in order to steal sensitive information or destroy critical assets. Active Cyber Deception (ACD) has emerged as an effective proactive cyber defense against malware to enable misleading adversaries by presenting fake data and engaging them to learn novel attack techniques. However, real-time malware deception is a complex and challenging task because (1) it requires a comprehensive understanding of the malware behaviors at technical and tactical levels in order to create the appropriate deception ploys and resources that can leverage this behavior and mislead malware, and (2) it requires a configurable yet provably valid deception planning to guarantee effective and safe real-time deception orchestration. This article presents symbSODA, a highly configurable and verifiable cyber deception system that analyzes real-world malware using multipath execution to discover API patterns that represent attack techniques/tactics critical for deception, enables users to create their own customized deception ploys based on the malware type and objectives, allows for constructing conflict-free Deception Playbooks , and finally automates the deception orchestration to execute the malware inside a deceptive environment. symbSODA extracts Malicious Sub-graphs (MSGs) consisting of WinAPIs from real-world malware and maps them to tactics and techniques using the ATT&CK framework to facilitate the construction of meaningful user-defined deception playbooks. We conducted a comprehensive evaluation study on symbSODA using 255 recent malware samples. We demonstrated that the accuracy of the end-to-end malware deception is 95% on average, with negligible overhead using various deception goals and strategies. Furthermore, our approach successfully extracted MSGs with a 97% recall, and our MSG-to-MITRE mapping achieved a top-1 accuracy of 88.75%. Our study suggests that symbSODA can serve as a general-purpose Malware Deception Factory to automatically produce customized deception playbooks against arbitrary malware behavior.","PeriodicalId":56050,"journal":{"name":"ACM Transactions on Privacy and Security","volume":"3 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134992869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shahnewaz Karim Sakib, George T Amariucai, Yong Guan
Information leakage is usually defined as the logarithmic increment in the adversary’s probability of correctly guessing the legitimate user’s private data or some arbitrary function of the private data when presented with the legitimate user’s publicly disclosed information. However, this definition of information leakage implicitly assumes that both the privacy mechanism and the prior probability of the original data are entirely known to the attacker. In reality, the assumption of complete knowledge of the privacy mechanism for an attacker is often impractical. The attacker can usually have access to only an approximate version of the correct privacy mechanism, computed from a limited set of the disclosed data, for which they can access the corresponding un-distorted data. In this scenario, the conventional definition of leakage no longer has an operational meaning. To address this problem, in this article, we propose novel meaningful information-theoretic metrics for information leakage when the attacker has incomplete information about the privacy mechanism—we call them average subjective leakage , average confidence boost , and average objective leakage , respectively. For the simplest, binary scenario, we demonstrate how to find an optimized privacy mechanism that minimizes the worst-case value of either of these leakages.
{"title":"Measures of Information Leakage for Incomplete Statistical Information: Application to a Binary Privacy Mechanism","authors":"Shahnewaz Karim Sakib, George T Amariucai, Yong Guan","doi":"10.1145/3624982","DOIUrl":"https://doi.org/10.1145/3624982","url":null,"abstract":"Information leakage is usually defined as the logarithmic increment in the adversary’s probability of correctly guessing the legitimate user’s private data or some arbitrary function of the private data when presented with the legitimate user’s publicly disclosed information. However, this definition of information leakage implicitly assumes that both the privacy mechanism and the prior probability of the original data are entirely known to the attacker. In reality, the assumption of complete knowledge of the privacy mechanism for an attacker is often impractical. The attacker can usually have access to only an approximate version of the correct privacy mechanism, computed from a limited set of the disclosed data, for which they can access the corresponding un-distorted data. In this scenario, the conventional definition of leakage no longer has an operational meaning. To address this problem, in this article, we propose novel meaningful information-theoretic metrics for information leakage when the attacker has incomplete information about the privacy mechanism—we call them average subjective leakage , average confidence boost , and average objective leakage , respectively. For the simplest, binary scenario, we demonstrate how to find an optimized privacy mechanism that minimizes the worst-case value of either of these leakages.","PeriodicalId":56050,"journal":{"name":"ACM Transactions on Privacy and Security","volume":"3 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134992867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Prakash Shrestha, Ahmed Tanvir Mahdad, Nitesh Saxena
Reducing the level of user effort involved in traditional two-factor authentication (TFA) constitutes an important research topic. An interesting representative approach, Sound-Proof , leverages ambient sounds to detect the proximity between the second-factor device (phone) and the login terminal (browser), and eliminates the need for the user to transfer PIN codes. In this paper, we identify a weakness of the Sound-Proof system that makes it completely vulnerable to passive “environment guessing” and active “environment manipulating” remote attackers and proximity attackers. Addressing these security issues, we propose Listening-Watch , a new TFA mechanism based on a wearable device (watch/bracelet) and active browser-generated random speech sounds. As the user attempts to log in, the browser populates a short random code encoded into speech, and the login succeeds if the watch’s audio recording contains this code (decoded using speech recognition ), and is similar enough to the browser’s audio recording. The remote attacker, who has guessed/manipulated the user’s environment, will be defeated since authentication success relies upon the presence of the random code in watch’s recordings. The proximity attacker will also be defeated unless it is extremely close (< 50 cm) to the watch since the wearable microphones are usually designed to capture only nearby sounds (e.g., voice commands).
{"title":"Sound-based Two-Factor Authentication: Vulnerabilities and Redesign","authors":"Prakash Shrestha, Ahmed Tanvir Mahdad, Nitesh Saxena","doi":"10.1145/3632175","DOIUrl":"https://doi.org/10.1145/3632175","url":null,"abstract":"Reducing the level of user effort involved in traditional two-factor authentication (TFA) constitutes an important research topic. An interesting representative approach, Sound-Proof , leverages ambient sounds to detect the proximity between the second-factor device (phone) and the login terminal (browser), and eliminates the need for the user to transfer PIN codes. In this paper, we identify a weakness of the Sound-Proof system that makes it completely vulnerable to passive “environment guessing” and active “environment manipulating” remote attackers and proximity attackers. Addressing these security issues, we propose Listening-Watch , a new TFA mechanism based on a wearable device (watch/bracelet) and active browser-generated random speech sounds. As the user attempts to log in, the browser populates a short random code encoded into speech, and the login succeeds if the watch’s audio recording contains this code (decoded using speech recognition ), and is similar enough to the browser’s audio recording. The remote attacker, who has guessed/manipulated the user’s environment, will be defeated since authentication success relies upon the presence of the random code in watch’s recordings. The proximity attacker will also be defeated unless it is extremely close (< 50 cm) to the watch since the wearable microphones are usually designed to capture only nearby sounds (e.g., voice commands).","PeriodicalId":56050,"journal":{"name":"ACM Transactions on Privacy and Security","volume":"1 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135041828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}