Enhanced detection of obfuscated malware in memory dumps: a machine learning approach for advanced cybersecurity

IF 3.9 4区 计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS Cybersecurity Pub Date : 2024-01-25 DOI:10.1186/s42400-024-00205-z
Md. Alamgir Hossain, Md. Saiful Islam
{"title":"Enhanced detection of obfuscated malware in memory dumps: a machine learning approach for advanced cybersecurity","authors":"Md. Alamgir Hossain, Md. Saiful Islam","doi":"10.1186/s42400-024-00205-z","DOIUrl":null,"url":null,"abstract":"<p>In the realm of cybersecurity, the detection and analysis of obfuscated malware remain a critical challenge, especially in the context of memory dumps. This research paper presents a novel machine learning-based framework designed to enhance the detection and analytical capabilities against such elusive threats for binary and multi type’s malware. Our approach leverages a comprehensive dataset comprising benign and malicious memory dumps, encompassing a wide array of obfuscated malware types including Spyware, Ransomware, and Trojan Horses with their sub-categories. We begin by employing rigorous data preprocessing methods, including the normalization of memory dumps and encoding of categorical data. To tackle the issue of class imbalance, a Synthetic Minority Over-sampling Technique is utilized, ensuring a balanced representation of various malware types. Feature selection is meticulously conducted through Chi-Square tests, mutual information, and correlation analyses, refining the model’s focus on the most indicative attributes of obfuscated malware. The heart of our framework lies in the deployment of an Ensemble-based Classifier, chosen for its robustness and effectiveness in handling complex data structures. The model’s performance is rigorously evaluated using a suite of metrics, including accuracy, precision, recall, F1-score, and the area under the ROC curve (AUC) with other evaluation metrics to assess the model’s efficiency. The proposed model demonstrates a detection accuracy exceeding 99% across all cases, surpassing the performance of all existing models in the realm of malware detection.</p>","PeriodicalId":36402,"journal":{"name":"Cybersecurity","volume":"16 1","pages":""},"PeriodicalIF":3.9000,"publicationDate":"2024-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cybersecurity","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1186/s42400-024-00205-z","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

In the realm of cybersecurity, the detection and analysis of obfuscated malware remain a critical challenge, especially in the context of memory dumps. This research paper presents a novel machine learning-based framework designed to enhance the detection and analytical capabilities against such elusive threats for binary and multi type’s malware. Our approach leverages a comprehensive dataset comprising benign and malicious memory dumps, encompassing a wide array of obfuscated malware types including Spyware, Ransomware, and Trojan Horses with their sub-categories. We begin by employing rigorous data preprocessing methods, including the normalization of memory dumps and encoding of categorical data. To tackle the issue of class imbalance, a Synthetic Minority Over-sampling Technique is utilized, ensuring a balanced representation of various malware types. Feature selection is meticulously conducted through Chi-Square tests, mutual information, and correlation analyses, refining the model’s focus on the most indicative attributes of obfuscated malware. The heart of our framework lies in the deployment of an Ensemble-based Classifier, chosen for its robustness and effectiveness in handling complex data structures. The model’s performance is rigorously evaluated using a suite of metrics, including accuracy, precision, recall, F1-score, and the area under the ROC curve (AUC) with other evaluation metrics to assess the model’s efficiency. The proposed model demonstrates a detection accuracy exceeding 99% across all cases, surpassing the performance of all existing models in the realm of malware detection.

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
增强对内存转储中混淆恶意软件的检测:先进网络安全的机器学习方法
在网络安全领域,混淆恶意软件的检测和分析仍然是一项严峻的挑战,尤其是在内存转储的情况下。本研究论文介绍了一种新颖的基于机器学习的框架,旨在增强对二进制和多类型恶意软件的检测和分析能力,以应对此类难以捉摸的威胁。我们的方法利用了由良性和恶意内存转储组成的综合数据集,其中包括间谍软件、勒索软件和木马及其子类等多种混淆恶意软件类型。我们首先采用了严格的数据预处理方法,包括内存转储的规范化和分类数据的编码。为了解决类不平衡的问题,我们采用了合成少数群体过度采样技术,以确保各种恶意软件类型的均衡代表性。通过秩方检验、互信息和相关性分析,对特征进行了细致的选择,从而使模型更加专注于混淆恶意软件最具指示性的属性。我们框架的核心在于部署一个基于集合的分类器,该分类器因其在处理复杂数据结构时的鲁棒性和有效性而被选中。我们使用一系列指标对模型的性能进行了严格评估,包括准确度、精确度、召回率、F1 分数、ROC 曲线下面积(AUC)以及其他评估指标,以评估模型的效率。所提出的模型在所有情况下的检测准确率都超过了 99%,超越了恶意软件检测领域所有现有模型的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Cybersecurity
Cybersecurity Computer Science-Information Systems
CiteScore
7.30
自引率
0.00%
发文量
77
审稿时长
9 weeks
期刊最新文献
Cloud EMRs auditing with decentralized (t, n)-threshold ownership transfer SIFT: Sifting file types—application of explainable artificial intelligence in cyber forensics Modelling user notification scenarios in privacy policies FLSec-RPL: a fuzzy logic-based intrusion detection scheme for securing RPL-based IoT networks against DIO neighbor suppression attacks New partial key exposure attacks on RSA with additive exponent blinding
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1