Enhancing Forensic Analysis Using a Machine Learning-based Approach

Samira Benkerroum, Khalid Chougdali
DOI: 10.1109/CommNet60167.2023.10365260 (https://doi.org/10.1109/CommNet60167.2023.10365260)
Published in: 2023 6th International Conference on Advanced Communication Technologies and Networking (CommNet), pp. 1-6
Publication date: 2023-12-11
Citations: 0

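Abstract

In recent years, computers and other digital devices have contributed to the global spread of cyber threats and cybercrime. Cyberattacks leave artefacts on the storage of the target device; these artefacts require special handling and must be investigated so that the attack's behavior can be studied, analyzed, and prevented from recurring. Although digital forensic techniques for recovering volatile and non-volatile evidence continue to develop, manual investigations remain time-intensive and laborious. The proposed solution is to automate manual forensic analysis tasks in order to reduce human effort and improve time efficiency. This paper summarizes the digital forensic investigation process and discusses existing machine learning (ML) solutions for automating the analysis step. Finally, it proposes a machine learning-based approach in which binary classification is performed on the CIC-MalMem-2022 dataset to identify malware, using eight algorithms: K-Nearest Neighbors, Naive Bayes, Random Forest, Support Vector Machine, Decision Tree, Logistic Regression, Gradient Boosted Tree, and Multi-Layer Perceptron. The algorithms' performance was compared using Precision, Recall, F1-score, Accuracy, and Area Under the Curve. Random Forest and Gradient Boosted Tree performed best, reaching an accuracy of 99.98% in detecting malware from memory scans. Logistic Regression performed worst on the memory data, with an accuracy of 95.75%. Overall, many of the algorithms tested achieved very satisfactory results.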
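To make the binary classification step concrete, here is a minimal sketch of one of the eight listed algorithms, K-Nearest Neighbors, written in plain Python. It is an illustration only and not the authors' implementation: the function name `knn_predict`, the choice of `k=3`, and the two-dimensional feature vectors are all assumptions made for this example, and the toy data is hand-made rather than drawn from CIC-MalMem-2022.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the k closest samples and count their labels.
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, hand-made feature vectors (NOT taken from CIC-MalMem-2022).
# Each row stands in for two scaled memory-dump features of one sample.
train_X = [
    [0.10, 0.20], [0.20, 0.10], [0.15, 0.25],   # benign-like samples
    [0.80, 0.90], [0.90, 0.80], [0.85, 0.75],   # malware-like samples
]
train_y = ["benign", "benign", "benign", "malware", "malware", "malware"]

print(knn_predict(train_X, train_y, [0.82, 0.88]))  # prints "malware"
print(knn_predict(train_X, train_y, [0.12, 0.18]))  # prints "benign"
```

An odd `k` avoids tied votes when there are only two classes. On a real dataset the features would normally be scaled first, since Euclidean distance is sensitive to the range of each feature.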
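The five evaluation metrics named in the abstract can be computed directly from true labels, predicted labels, and classifier scores. The sketch below uses the standard textbook definitions in plain Python; the function names `binary_metrics` and `roc_auc` and the example labels and scores are made up for illustration and are not results from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Fraction of (positive, negative) pairs where the positive scores higher.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = malware, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.6]

print(binary_metrics(y_true, y_pred))  # all four values are 0.75
print(roc_auc(y_true, scores))         # prints 0.9375
```

Accuracy alone can be misleading on imbalanced data, which is one reason to report precision, recall, F1, and AUC alongside it.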