Pub Date: 2024-08-08 | DOI: 10.1016/j.cose.2024.104036
Intelligent diagnostic modeling of industrial equipment (IDMIE) addresses various industrial challenges, yet many organizations have raised concerns about data privacy. The reliance on third-party trust and stringent privacy requirements pose further obstacles to ensuring privacy. To tackle these issues, this study proposes a generative model based on differential privacy and one-dimensional operational generative adversarial networks (DP1D-OpGAN); to reduce the privacy budget and protect the privacy of the generative model, a method is proposed that trains the learning parameters with perturbed gradient vectors. Additionally, a discrete multi-wavelet transform convolutional neural network (DMWA-CNN) classification model is integrated to enhance diagnostic performance. The model's safety, high performance, and generalizability are validated through multiple comprehensive experiments.
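Training with perturbed gradient vectors is commonly realized in the style of DP-SGD: clip each per-example gradient to bound its sensitivity, then add calibrated Gaussian noise before the parameter update. The following NumPy sketch illustrates that general mechanism; it is not the paper's exact algorithm, and the function name and hyperparameters are illustrative.

```python
import numpy as np

def dp_perturbed_gradient(per_example_grads, clip_norm=1.0,
                          noise_multiplier=1.1, rng=None):
    """Clip each per-example gradient to clip_norm, average them, and add
    Gaussian noise scaled to the clipping bound (DP-SGD-style update)."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    # Noise standard deviation is proportional to the sensitivity
    # (clip_norm) and inversely proportional to the batch size.
    sigma = noise_multiplier * clip_norm / len(per_example_grads)
    return mean_grad + rng.normal(0.0, sigma, size=mean_grad.shape)
```

The clipping step caps any single sample's influence on the update, which is what lets the added noise translate into a formal privacy guarantee.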
Title: An intelligent diagnostic model for industrial equipment with privacy protection
Pub Date: 2024-08-08 | DOI: 10.1016/j.cose.2024.104042
Cyber attacks against power grids, interrupting utility service and causing blackouts, are on the rise and increasingly motivate researchers to investigate this topic. Models of real-world power grids are an indispensable prerequisite for such research, but operators do not make them available, allegedly for reasons of protection. This security-by-obscurity strategy appears futile, as grid artifacts (lines, plants, substations) are large and cannot easily be hidden. Inferring real-world model data from publicly available data seems promising, and indeed, multiple models have been generated through Open Source Intelligence (OSINT). Questions about the models' quality, however, remain open, yet they are of utmost importance for research building on these models, especially as the results might have considerable impact on society and national security. This paper investigates whether OSINT yields data on real-world power grids of sufficient quality. Using the European country of Austria as an example, we examine whether all parameters relevant for power flow analysis, a standard approach in power engineering, can be inferred from publicly available data (OpenStreetMap, national statistics, etc.), and validate this data against ground truths, including governmental land use plans, Google Street View, and the power sector's information material. Our validation shows that the inferred data matches reality well: among others, the extra-high voltage level is 100% (lines) resp. 98% (substations) complete. Moreover, the inferred data is up to date, as the construction of lines or substations is consistently documented in OSM, in 76% of cases even before the construction works are finalized. An analysis of 24 other European countries revealed that electric systems, substations, and power plants are documented in OSM to a similar extent as in Austria, motivating the application of our approach to these countries as well.
The contribution of our OSINT-based approach is twofold: First, it facilitates the development of models of real-world power grids, fostering research and discussion that is independent of the power grid operators, in the security domain and beyond. Second, our method represents an attack itself, challenging the energy sector’s security-by-obscurity approach.
Title: Fostering security research in the energy sector: A validation of open source intelligence for power grid model data
Pub Date: 2024-08-07 | DOI: 10.1016/j.cose.2024.104040
Despite extensive academic research on anomaly detection in the cybersecurity domain, its successful adoption in real-world settings remains limited. This paper addresses the challenges of applying outlier detection techniques to threat detection within Security Information and Event Management (SIEM) systems. It particularly highlights the significance of contextualization and explainability, while challenging the assumption that outliers invariably indicate malicious activity. It proposes a simple yet effective outlier detection technique designed to mimic a Security Operations Center (SOC) analyst's reasoning process when identifying anomalies and deciding on maliciousness. The approach emphasizes explainability and simplicity, achieved by combining the output of simple, context-aware univariate submodels that calculate an outlier score for each entry.
The proposed technique is first evaluated on a public dataset, demonstrating high performance in detecting outliers compared to other well-known algorithms. Furthermore, to assess its practicality in a real-world scenario, the approach is deployed in production alongside the SIEM of a large international enterprise with over 100,000 assets, utilizing 20 terabytes of Endpoint Detection and Response (EDR) logs to detect Living-off-the-Land Binaries (LOLBins). The proposed framework can empower SOC analysts to develop scalable, effective, and interpretable outlier-based threat detection use cases.
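Combining simple univariate submodels into one per-entry outlier score could be sketched as follows: each submodel scores a single feature (here by the rarity of its value), and the scores are averaged. This is a minimal illustration of the ensemble idea, not the paper's exact submodels; function names are hypothetical.

```python
from collections import Counter
import math

def rarity_scores(values):
    """Univariate submodel: score each value by -log of its relative
    frequency, so rare values receive high outlier scores."""
    counts = Counter(values)
    n = len(values)
    return [-math.log(counts[v] / n) for v in values]

def combined_outlier_scores(columns):
    """Average the per-feature rarity scores into one outlier score
    per entry (each column is one feature across all entries)."""
    per_feature = [rarity_scores(col) for col in columns]
    return [sum(scores) / len(per_feature) for scores in zip(*per_feature)]
```

Because each submodel is univariate and frequency-based, an analyst can read off exactly which feature value made an entry score high, which is what gives this style of ensemble its explainability.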
Title: HEOD: Human-assisted Ensemble Outlier Detection for cybersecurity
Pub Date: 2024-08-06 | DOI: 10.1016/j.cose.2024.104039
Truth discovery is widely used for data collection in crowdsourcing; however, as privacy awareness deepens, ordinary truth discovery can no longer meet users' demand for privacy protection, and solving the privacy problem is one of the critical challenges of truth discovery. A number of differentially private truth discovery mechanisms have been proposed. However, their privacy budget allocation considers neither the differences in data precision across multi-source datasets nor the anomalous results that small-value data produces after aggregation. We propose a two-layer truth discovery framework based on differential privacy and precision partitioning that ensures privacy while accounting for data precision, and still achieves high-accuracy truth discovery. The main challenges addressed in this paper are obtaining highly accurate truth values in sparse data scenarios without exposing the original values to the cloud server when correcting aggregation anomalies after noise addition, and improving the precision of truth estimation for perturbed streaming data by refining the privacy protection level according to the accuracy of the data. Specifically, we first formulate a data-sampling algorithm that determines the data precision of different users and sieves out anomalous and duplicate data, thereby assessing the quality of user-uploaded data. Then, we formulate a new privacy budget allocation mechanism that synthesizes the sampling results from data preprocessing and fully considers data precision to quantify user privacy as concrete values. We further provide a quadratic truth discovery mechanism based on a predictive interpolation algorithm for small-value data, ensuring the reliability of small-data aggregation results. We demonstrate that our framework achieves differential privacy for user-supplied data, and extensive experiments on three real-world datasets prove the effectiveness of our system framework.
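One plausible reading of precision-aware budget allocation: give each user a share of the total privacy budget proportional to an assessed precision weight, then perturb that user's values with Laplace noise of scale sensitivity/ε_i. This sketch is an assumption about the general shape of such a mechanism, not the paper's allocation rule; the weights and sensitivity are illustrative.

```python
import numpy as np

def allocate_budgets(weights, total_epsilon):
    """Split a total privacy budget across users proportionally to
    their data-precision weights (higher precision -> larger epsilon,
    i.e. less noise)."""
    w = np.asarray(weights, dtype=float)
    return total_epsilon * w / w.sum()

def laplace_perturb(values, epsilons, sensitivity=1.0, rng=None):
    """Perturb each user's value with Laplace noise of per-user scale
    sensitivity / epsilon_i (the standard Laplace mechanism)."""
    rng = rng or np.random.default_rng(0)
    scales = sensitivity / np.asarray(epsilons, dtype=float)
    return np.asarray(values, dtype=float) + rng.laplace(0.0, scales)
```

Under sequential composition the per-user epsilons sum to the total budget, so weighting the split by precision spends more of the budget where the data can actually support a tighter estimate.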
Title: Privacy-preserving quadratic truth discovery based on Precision partitioning
Pub Date: 2024-08-05 | DOI: 10.1016/j.cose.2024.104034
Internet of Things (IoT) devices have been integrated into almost all everyday applications of human life, such as healthcare, transportation, and agriculture. This widespread adoption has opened a large threat landscape, leaving security gaps in IoT-enabled networks. These resource-constrained devices lack sufficient security mechanisms, become the weakest link in our computer networks, and jeopardize systems and data. To address this issue, Intrusion Detection Systems (IDS) have been proposed as one of many tools to mitigate IoT-related intrusions. While IDS have proven to be a crucial tool for threat detection, their dependence on labeled data and their high computational costs have become obstacles to real-life adoption. In this work, we present IoT-PRIDS, a framework equipped with a host-based, anomaly-based intrusion detection system that leverages "packet representations" to understand the typical behavior of devices, focusing on their communications, services, and packet header values. It is a lightweight non-ML model that relies solely on benign network traffic and offers a practical way to secure IoT environments. Our results show that the model detects the majority of abnormal flows while keeping false alarms to a minimum, and it is promising for real-world applications.
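The packet-representation idea, learning which header values a device normally emits from benign traffic only and flagging unseen combinations, might be sketched as a per-device profile of observed field values. This is a simplified illustration of the anomaly-based approach described above; the class, field names, and packet encoding are hypothetical.

```python
from collections import defaultdict

class BenignProfile:
    """Per-device profile of header-field values seen in benign traffic;
    a packet is anomalous if any field carries a previously unseen value
    (or the device itself was never seen)."""
    def __init__(self):
        # device -> field -> set of observed values
        self.seen = defaultdict(lambda: defaultdict(set))

    def fit(self, packets):
        # packets: dicts like {"device": ..., "dst_port": ..., "proto": ...}
        for pkt in packets:
            for field, value in pkt.items():
                if field != "device":
                    self.seen[pkt["device"]][field].add(value)

    def is_anomalous(self, pkt):
        profile = self.seen.get(pkt["device"])
        if profile is None:
            return True
        return any(pkt[f] not in profile.get(f, set())
                   for f in pkt if f != "device")
```

Because IoT devices tend to have narrow, repetitive traffic patterns, even such set-membership profiles can separate routine flows from deviations without any labeled attack data, which matches the lightweight, non-ML framing above.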
Title: IoT-PRIDS: Leveraging packet representations for intrusion detection in IoT networks
Pub Date: 2024-08-05 | DOI: 10.1016/j.cose.2024.104033
The increasing dependence on satellite technology for critical applications, such as telecommunications, Earth observation, and navigation, underscores the need for robust security measures to safeguard these assets from potential cyber threats. Moreover, as many satellite systems rely on the Controller Area Network (CAN) protocol for efficient data exchange among onboard subsystems, they become prime targets for cyberattacks. While prior contributions present various options for detecting attacks on the CAN bus, none proposes an architecture suitable for satellite systems. To address this concern, this paper presents a novel approach to developing an adaptive distributed Intrusion Detection System (IDS) for satellites that integrates machine and deep learning techniques for the classification of CAN frames. The system is specifically designed to overcome the inherent power and computational constraints of satellite operations by executing time-based anomaly detection on board and content-based detection at the ground segment. To evaluate the effectiveness of the proposed solution, experiments are conducted using representative datasets. The results demonstrate that the distributed IDS offers a promising solution for improving the security of satellite systems, achieving high detection rates ranging from 91.12% to 99.86% (F1-score).
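The onboard, time-based half of such a split architecture is often built on the observation that CAN IDs are transmitted at a nearly fixed cadence. A minimal sketch, assuming periodic benign traffic and an illustrative z-score threshold (not the paper's actual detector):

```python
import statistics

class InterArrivalDetector:
    """Time-based onboard check: learn mean/stdev of inter-arrival times
    per CAN ID from benign traffic, then flag gaps that deviate too far
    from the learned cadence."""
    def __init__(self, z_threshold=3.0):
        self.z = z_threshold
        self.stats = {}  # can_id -> (mean_gap, stdev_gap)

    def fit(self, frames):
        # frames: list of (timestamp, can_id) tuples from benign traffic
        by_id = {}
        for ts, cid in frames:
            by_id.setdefault(cid, []).append(ts)
        for cid, times in by_id.items():
            gaps = [b - a for a, b in zip(times, times[1:])]
            if len(gaps) >= 2:
                # Perfectly periodic traffic gives stdev 0; floor it so
                # any deviation from the cadence is still detectable.
                self.stats[cid] = (statistics.mean(gaps),
                                   statistics.stdev(gaps) or 1e-9)

    def is_anomalous(self, gap, can_id):
        if can_id not in self.stats:
            return True  # unknown ID was never seen in benign traffic
        mean, sd = self.stats[can_id]
        return abs(gap - mean) / sd > self.z
```

Such a check is cheap enough to run on board, leaving the heavier content-based classification of frame payloads to the ground segment, which is the division of labor the abstract describes.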
Title: CANSat-IDS: An adaptive distributed Intrusion Detection System for satellites, based on combined classification of CAN traffic
Pub Date: 2024-08-02 | DOI: 10.1016/j.cose.2024.104031
The digital landscape faces an escalating wave of sophisticated malware threats to organizations and individuals and is increasingly vulnerable to cyber attacks. The dominance of Windows operating systems across corporate and individual computing environments makes Windows a prime target for cyber threats. As malware increasingly employs advanced code obfuscation and packing techniques to evade static detection, dynamic analysis through API calls has become more helpful in identifying malicious behavior. Deep learning techniques have emerged as a promising strategy, significantly advancing malware detection in response to the ever-evolving cyber threat landscape. Leveraging these techniques, we introduce a malware detection framework built on the Longformer model, which is specifically designed to handle long text sequences. Our approach transforms API call sequences into detailed natural-language descriptions with the help of API descriptions and arguments, enabling a deeper understanding of software behaviors. This transformation allows the Longformer model to identify malicious patterns efficiently, with enhanced detection accuracy. Comparative analyses with state-of-the-art techniques and conventional deep learning models show significant improvements in accuracy, precision, recall, and F1 score; the proposed model achieves an accuracy of 0.992, highlighting its efficacy in identifying and classifying malicious behavior.
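The transformation of API call sequences into natural-language behavior descriptions might look like the following. The description templates and API argument handling are invented for illustration; a real system would derive them from official API documentation rather than this hand-written table.

```python
# Hypothetical templates mapping API names to human-readable behaviors.
API_DESCRIPTIONS = {
    "CreateFileW": "opens or creates the file {arg}",
    "WriteFile": "writes data to the handle {arg}",
    "RegSetValueExW": "sets the registry value {arg}",
}

def describe_sequence(calls):
    """Render an (api_name, argument) sequence as one running
    natural-language description suitable for a long-sequence
    text classifier such as Longformer."""
    parts = []
    for api, arg in calls:
        # Fall back to a generic phrasing for APIs without a template.
        template = API_DESCRIPTIONS.get(api, "calls {api} with {arg}")
        parts.append(template.format(arg=arg, api=api))
    return "The program " + ", then ".join(parts) + "."
```

Feeding such descriptions (instead of raw API tokens) to the classifier lets it exploit the semantics in API names and arguments, which is the motivation the abstract gives for the transformation.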
Title: BD-MDLC: Behavior description-based enhanced malware detection for windows environment using longformer classifier
Pub Date: 2024-08-02 | DOI: 10.1016/j.cose.2024.104030
Due to their open architecture, collaborative filtering recommender systems are susceptible to recommendation attacks, in which attackers inject fake rating data to affect the accuracy of recommendation results. Numerous detection methods have been designed and proven effective against these attacks. In recent years, however, deep learning-based recommendation attack models such as GSA-GAN have shown higher concealment, posing new challenges to existing detection methods. Motivated by the need for improved detection, we propose a new approach called CNN-BAG, which integrates convolutional neural network (CNN) and Bagging (BAG) techniques, simultaneously leveraging the deep learning capabilities of CNN and the ensemble learning strengths of Bagging. First, we construct a CNN-based deep neural network as the base learner to automatically extract and learn features of recommendation attacks. Second, we use the Bagging algorithm to perform bootstrap sampling on the training data, generating multiple diverse training subsets. The base learners are then trained on these subsets to produce multiple base classifiers for recommendation attacks. Finally, we combine the base classifiers' outputs by majority voting to obtain the final detection result. To assess the performance of CNN-BAG, we compared it against several well-established detection methods on the Movielens-10M and Amazon datasets; our experiments revealed that CNN-BAG is adept at identifying various attack types, including deep learning-based recommendation attack models.
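The Bagging half of the pipeline, bootstrap sampling plus majority voting, can be sketched as follows. To keep the example self-contained, a trivial threshold learner stands in for the CNN base learner; names and data are illustrative, not the paper's implementation.

```python
import random
from collections import Counter

class ThresholdLearner:
    """Stand-in base learner (CNN-BAG uses a CNN): predicts 1 when the
    single feature exceeds the midpoint of the class means."""
    def fit(self, X, y):
        pos = [x for x, lab in zip(X, y) if lab == 1]
        neg = [x for x, lab in zip(X, y) if lab == 0]
        if pos and neg:
            self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        else:
            self.threshold = 0.5  # degenerate one-class bootstrap sample
        return self

    def predict(self, x):
        return 1 if x > self.threshold else 0

def bagging_fit(X, y, n_estimators=5, seed=0):
    """Train base learners on bootstrap resamples (sampling with
    replacement) of the training data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        models.append(ThresholdLearner().fit([X[i] for i in idx],
                                             [y[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Majority vote across the base classifiers."""
    votes = Counter(m.predict(x) for m in models)
    return votes.most_common(1)[0][0]
```

Bootstrap resampling diversifies the base classifiers, and majority voting then damps the variance of any single one, which is the usual rationale for wrapping a high-variance learner such as a deep network in Bagging.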
{"title":"A recommendation attack detection approach integrating CNN with Bagging","authors":"","doi":"10.1016/j.cose.2024.104030","DOIUrl":"10.1016/j.cose.2024.104030","url":null,"abstract":"<div><p>Due to their open architecture, collaborative filtering recommender systems are susceptible to recommendation attacks, in which attackers inject fake rating data into the system to affect the accuracy of recommendation results. To detect these attacks, numerous detection methods have been designed and proven effective. However, in recent years, deep learning-based recommendation attack models such as GSA-GAN have shown higher concealment, posing new challenges to existing detection methods. Motivated by the need for improved detection, in this paper we propose a new approach called CNN-BAG, which integrates convolutional neural network (CNN) and Bagging (BAG) techniques. CNN-BAG can enhance the detection performance by simultaneously leveraging the deep learning capabilities of CNN and the ensemble learning strengths of Bagging. Firstly, we construct a deep neural network based on CNN as the base learner to automatically extract and learn features of recommendation attacks. Secondly, we use the Bagging algorithm to perform bootstrap sampling on the training data to generate multiple diverse training subsets. The above constructed base learners are then trained on these generated training subsets to produce multiple base classifiers for classifying recommendation attacks. Finally, we combine the base classifiers’ outputs using a majority voting method to obtain the final detection results. To assess the performance of CNN-BAG in detecting recommendation attacks, we compared it against several well-established detection methods on the Movielens-10M and Amazon datasets. 
Our experiments revealed that CNN-BAG is adept at identifying various attack types, including the deep learning-based recommendation attack models.</p></div>","PeriodicalId":51004,"journal":{"name":"Computers & Security","volume":null,"pages":null},"PeriodicalIF":4.8,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
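The Bagging-with-majority-voting pipeline the CNN-BAG abstract describes can be sketched minimally as follows. A toy one-feature threshold classifier stands in for the paper's CNN base learner, and all names and data here are illustrative assumptions, not the authors' implementation:

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    # One Bagging training subset: len(data) draws with replacement.
    return [rng.choice(data) for _ in data]

def train_threshold_learner(sample):
    # Stand-in for the CNN base learner: threshold at the midpoint
    # between the two class means of a single scalar feature.
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:          # degenerate bootstrap: fall back
        t = 0.5
    else:
        t = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x, t=t: 1 if x > t else 0

def bagging_predict(learners, x):
    # Majority vote over the base classifiers' outputs.
    votes = Counter(clf(x) for clf in learners)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
# Toy data: feature = a scalar rating-profile score, label 1 = attack.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.8, 1), (0.9, 1), (1.0, 1)]
learners = [train_threshold_learner(bootstrap_sample(data, rng))
            for _ in range(5)]
print(bagging_predict(learners, 0.95))  # → 1 (flagged as attack)
print(bagging_predict(learners, 0.15))  # → 0 (genuine profile)
```

The bootstrap resampling gives each base classifier a slightly different view of the data, which is what makes the majority vote more robust than any single learner.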
Pub Date : 2024-07-30DOI: 10.1016/j.cose.2024.104026
This study was prompted by the scarcity of focused quantitative research on the cybersecurity of small-to-medium businesses (SMBs). Our research aimed to understand the factors influencing SMBs' approach to cybersecurity, their level of threat awareness, and the importance they place on cybersecurity. It also explored the extent to which NIST CSF practices are implemented by SMBs, while detecting and ranking the prevalent challenges SMBs face. Additionally, the resources SMBs turn to for help and guidance were evaluated. While the survey-based study focused on Western Australian SMBs, the results are of wider interest. Our study found a lack of funds to be the biggest hindrance to cybersecurity, along with a lack of knowledge about where to start implementing good security practices. SMBs also lacked familiarity with relevant regulations and frameworks. The study highlights areas for improvement, such as access control mechanisms, individual user accounts, formalised policies and procedures, and dedicated budgets. SMBs rely heavily on Google search for cybersecurity information, emphasising the need for optimised search results from authoritative sources. IT service providers and informal networks also emerge as important sources of cybersecurity guidance, while local universities could assist SMBs but remain underutilised in this regard. Interestingly, factors such as organisational size, industry sector, and revenue level did not significantly affect SMBs' perception of vulnerability to cyber threats. However, further investigation is needed to evaluate the effectiveness of different IT service models for SMBs' cybersecurity needs. Overall, the research provides valuable insights into the specific gaps and challenges SMBs face in the cybersecurity domain, as well as their preferred methods of seeking and consuming cybersecurity assistance. The findings can guide the development of targeted strategies and policies to enhance the cybersecurity posture of SMBs.
{"title":"Cybersecurity preparedness of small-to-medium businesses: A Western Australia study with broader implications","authors":"","doi":"10.1016/j.cose.2024.104026","DOIUrl":"10.1016/j.cose.2024.104026","url":null,"abstract":"<div><p>This study was prompted by the scarcity of focused quantitative research on the cybersecurity of SMBs. Our research aimed to understand the factors influencing SMBs' approach to cybersecurity, their level of threat awareness and the importance placed on cybersecurity. It also explored the extent to which NIST CSF practices are implemented by SMBs while also detecting and ranking the prevalent challenges faced by SMBs. Additionally, resources that SMBs turn to for help and guidance were also evaluated. While the survey-based study was on Western Australian SMBs, the results are of more general and wider interest. Our study found the lack of funds to be the biggest hindrance to cybersecurity, along with a lack of knowledge on where to start implementing good security practices. SMBs also lacked familiarity with relevant regulations and frameworks. The study highlights areas for improvement, such as access control mechanisms, individual user accounts, formalised policies and procedures, and dedicated budgets. SMBs heavily rely on Google search for cybersecurity information, emphasising the need for optimised search results from authoritative sources. IT service providers and informal networks also emerge as important sources of cybersecurity guidance, while local universities could assist SMBs but remain underutilised in this regard. Interestingly, factors such as organisational size, industry sector, and revenue level did not significantly impact SMBs' perception of vulnerability to cyber threats. However, further investigation is needed to evaluate the effectiveness of different IT service models for SMBs' cybersecurity needs. 
Overall, the research provides valuable insights into the specific gaps and challenges faced by SMBs in the cybersecurity domain, as well as their preferred methods of seeking and consuming cybersecurity assistance. The findings can guide the development of targeted strategies and policies to enhance the cybersecurity posture of SMBs.</p></div>","PeriodicalId":51004,"journal":{"name":"Computers & Security","volume":null,"pages":null},"PeriodicalIF":4.8,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167404824003316/pdfft?md5=ab6932582bfbe44312d2e544615351c6&pid=1-s2.0-S0167404824003316-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-29DOI: 10.1016/j.cose.2024.104029
Binary vulnerability detection is a significant area of research in computer security. Existing methods for detecting binary vulnerabilities primarily rely on binary code similarity analysis, detecting vulnerabilities by comparing the similarities embedded in binary code. Recently, Transformer-based models have achieved significant progress in this field, leveraging their strength in handling sequential data to better understand the semantics of assembly code. However, to avoid the out-of-vocabulary (OOV) problem, assembly code typically needs to be normalized, which discards some important numerical and jump information. In this paper, we propose HAformer, a Transformer-based model that semantically fuses hexadecimal machine code and assembly code to extract richer semantic information from binaries. By incorporating the hexadecimal machine code and a newly designed assembly code normalization method, HAformer alleviates the numerical information loss caused by traditional assembly code normalization, thereby addressing the OOV issue. Evaluation results demonstrate that HAformer outperforms the baseline method on the Recall@1 metric by 16.9%, 25.5% and 19.2% in cross-optimization-level, cross-compiler and cross-architecture settings, respectively. In real-world vulnerability detection experiments, HAformer achieves the highest accuracy.
{"title":"HAformer: Semantic fusion of hex machine code and assembly code for cross-architecture binary vulnerability detection","authors":"","doi":"10.1016/j.cose.2024.104029","DOIUrl":"10.1016/j.cose.2024.104029","url":null,"abstract":"<div><p>Binary vulnerability detection is a significant area of research in computer security. The existing methods for detecting binary vulnerabilities primarily rely on binary code similarity analysis, detecting vulnerabilities by comparing the similarities embedded in binary codes. Recently, Transformer-based models have achieved significant progress in this field, leveraging their advantage in handling sequential data to better understand the semantics of assembly code. However, to prevent the out-of-vocabulary (OOV) problems, assembly code typically needs to be normalized, which would lose some important numerical and jump information. In this paper, we propose HAformer, a Transformer-based model, which semantically fuses hexadecimal machine codes and assembly codes to extract richer semantic information from binary codes. By incorporating the hexadecimal machine code and a newly designed assembly code normalization method, HAformer can alleviate the problem of numerical information loss caused by traditional assembly code normalization, thereby addressing the issue of OOV. Evaluation results demonstrate that our HAformer outperforms the baseline method in the Recall@1 metric by 16.9%, 25.5% and 19.2% in cross-optimization level, cross-compiler and cross-architecture environments, respectively. 
In real-world vulnerability detection experiments, HAformer exhibits the highest accuracy.</p></div>","PeriodicalId":51004,"journal":{"name":"Computers & Security","volume":null,"pages":null},"PeriodicalIF":4.8,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
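Recall@1, the retrieval metric reported in the HAformer abstract, can be illustrated with a minimal sketch: given one true match per query in a candidate pool (e.g. the same function compiled for another architecture), it is the fraction of queries whose true match ranks first by embedding similarity. The cosine-similarity scoring and all embeddings below are illustrative assumptions, not the paper's model:

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def recall_at_1(queries, pool, ground_truth):
    # queries: embeddings of functions compiled one way;
    # pool: candidate embeddings compiled another way;
    # ground_truth[i]: index in pool of the true match for query i.
    hits = 0
    for i, q in enumerate(queries):
        best = max(range(len(pool)), key=lambda j: cosine(q, pool[j]))
        if best == ground_truth[i]:
            hits += 1
    return hits / len(queries)

# Toy 2-d embeddings: each query's true match points the same way.
queries = [[1.0, 0.0], [0.0, 1.0]]
pool = [[0.9, 0.1], [0.1, 0.8], [0.5, 0.5]]
print(recall_at_1(queries, pool, [0, 1]))  # → 1.0
```

Both toy queries rank their true match first, so Recall@1 is 1.0; a query whose nearest pool item is a different function would lower the score.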