Enhanced Malware Prediction and Containment Using Bayesian Neural Networks
Authors: Zahra Jamadi; Amir G. Aghdam
Journal: IEEE Journal of Radio Frequency Identification (impact factor 2.3; JCR Q2, Engineering, Electrical & Electronic)
Publication date: 2024-06-06
Publication type: Journal Article
DOI: 10.1109/JRFID.2024.3410881
URL: https://ieeexplore.ieee.org/document/10550924/
Citations: 0
Abstract
In this paper, we present an integrated framework that uses natural language processing (NLP) techniques and machine learning (ML) algorithms to detect malware at an early stage and predict its upcoming actions. We analyze application programming interface (API) call sequences in the same way as natural language inputs, using bidirectional long short-term memory (Bi-LSTM) networks and Bayesian neural networks (BNNs). In the first part, a Bagging-XGBoost algorithm interprets consecutive API calls as 2-gram and 3-gram strings for early-stage malware detection and feature importance analysis. Additionally, a Bi-LSTM predicts the upcoming actions of active malware by estimating the next API call in a sequence. In the second part, two separate Bayesian Bi-LSTMs are developed to complement this analysis: the first performs early-stage malware detection, and the second predicts the next action of active malware. The BNN not only predicts future malware actions but also assesses the uncertainty of each prediction. It also provides the second and third most probable predictions, which increases system reliability and effectiveness. Our unified framework is effective in both malware detection and action prediction. The Bayesian Bi-LSTM developed for predicting the next API call has an average accuracy of 89.53%, and the framework's accuracy for early-stage malware detection is 96.44%.
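Two of the ideas in the abstract can be illustrated with a minimal Python sketch: turning an API call sequence into 2-gram and 3-gram string features, and summarizing several sampled next-call distributions into the three most probable predictions plus an uncertainty score. The abstract does not specify the feature encoding, the Bayesian inference scheme, or the uncertainty measure, so everything here is an assumption made for illustration: the function names, the underscore-joined token format, the example trace, the sampled probabilities, and the use of predictive entropy are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def api_ngrams(calls, n):
    """Join each run of n consecutive API calls into one string token."""
    return ["_".join(calls[i:i + n]) for i in range(len(calls) - n + 1)]


def ngram_features(calls):
    """Count the 2-gram and 3-gram tokens of an API call sequence."""
    return Counter(api_ngrams(calls, 2) + api_ngrams(calls, 3))


def summarize_samples(samples, k=3):
    """Average sampled probability vectors; return top-k labels and entropy.

    `samples` is a list of dicts mapping API name -> probability, one dict
    per stochastic forward pass of a (hypothetical) Bayesian model.
    """
    labels = list(samples[0])
    mean = {a: sum(s[a] for s in samples) / len(samples) for a in labels}
    top_k = sorted(mean, key=mean.get, reverse=True)[:k]
    # Predictive entropy of the averaged distribution (assumed measure).
    entropy = -sum(p * math.log(p) for p in mean.values() if p > 0)
    return top_k, mean, entropy


# Hypothetical API call trace, for illustration only.
trace = ["NtOpenFile", "NtReadFile", "NtWriteFile", "NtClose"]
features = ngram_features(trace)

# Hypothetical sampled next-call distributions (not real model output).
samples = [
    {"NtClose": 0.6, "NtWriteFile": 0.3, "NtReadFile": 0.1},
    {"NtClose": 0.4, "NtWriteFile": 0.4, "NtReadFile": 0.2},
]
top3, mean_probs, entropy = summarize_samples(samples)
```

In a pipeline of this kind, the n-gram counts would be vectorized and passed to a tree-based classifier, and the sampled distributions would come from repeated stochastic forward passes of the sequence model. Both of those steps are outside this sketch.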