An Enhanced K-Means Clustering Algorithm for Phishing Attack Detections

IF 2.6 | Zone 3 (Engineering & Technology) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Electronics | Pub Date: 2024-09-16 | DOI: 10.3390/electronics13183677

Abdallah Al-Sabbagh, Khalil Hamze, Samiya Khan, Mahmoud Elkhodr


Citations: 0

Abstract

Phishing attacks continue to pose a significant threat to cybersecurity, employing increasingly sophisticated techniques to deceive victims into revealing sensitive information or downloading malware. This paper presents a comprehensive study on the application of Machine Learning (ML) techniques for identifying phishing websites, with a focus on enhancing detection accuracy and efficiency. We propose an approach that integrates the CfsSubsetEval attribute evaluator with the K-Means Clustering algorithm to improve phishing detection capabilities. Our method was evaluated using datasets of varying sizes (2000, 7000, and 10,000 samples) from a publicly available repository. Simulation results demonstrate that our approach achieves an accuracy of 89.2% on the 2000-sample dataset, outperforming the traditional kernel K-Means algorithm, which achieved an accuracy of 51.5%. Further analysis using precision, recall, and F1-score metrics corroborates the effectiveness of our method. We also discuss the scalability and real-world applicability of our approach, addressing limitations and proposing future research directions. This study contributes to the ongoing efforts to develop robust, efficient, and adaptable phishing detection systems in the face of evolving cyber threats.
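The pipeline outlined in the abstract (correlation-based feature subset selection, then K-Means, then accuracy, precision, recall, and F1) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: CfsSubsetEval is a Weka attribute evaluator, and here its merit score is approximated with absolute Pearson correlations and a greedy forward search; the two-cluster K-Means uses a deterministic farthest-point initialisation; clusters are mapped to labels by majority vote; and the eight-row toy dataset (`f0` informative, `f1` large-scale noise) is invented, so the scores it yields say nothing about the results reported in the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def cfs_merit(columns, labels, subset):
    """CFS-style merit: k*r_cf / sqrt(k + k(k-1)*r_ff), using |Pearson| (assumption)."""
    k = len(subset)
    r_cf = sum(abs(pearson(columns[j], labels)) for j in subset) / k
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = (sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def greedy_cfs(columns, labels):
    """Forward selection: add the best feature until the merit stops improving."""
    chosen, best, remaining = [], 0.0, list(range(len(columns)))
    while remaining:
        merit, j = max((cfs_merit(columns, labels, chosen + [j]), j) for j in remaining)
        if merit <= best:
            break
        chosen.append(j)
        remaining.remove(j)
        best = merit
    return chosen

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans2(rows, iters=20):
    """Two-cluster Lloyd iterations with farthest-point initialisation (assumption)."""
    centers = [list(rows[0]), list(max(rows, key=lambda r: sq_dist(r, rows[0])))]
    assign = []
    for _ in range(iters):
        assign = [0 if sq_dist(r, centers[0]) <= sq_dist(r, centers[1]) else 1
                  for r in rows]
        for c in (0, 1):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def evaluate(assign, labels):
    """Map each cluster to its majority label (1 = phishing), then score."""
    majority = {}
    for c in set(assign):
        members = [y for a, y in zip(assign, labels) if a == c]
        majority[c] = 1 if 2 * sum(members) >= len(members) else 0
    pred = [majority[a] for a in assign]
    tp = sum(p == 1 and y == 1 for p, y in zip(pred, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(pred, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(pred, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(p == y for p, y in zip(pred, labels)) / len(labels)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical toy data: f0 tracks the label, f1 is uncorrelated large-scale noise.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
f0 = [0.9, 0.8, 1.0, 0.7, 0.1, 0.2, 0.0, 0.3]
f1 = [5, 1, 5, 1, 5, 1, 5, 1]
columns = [f0, f1]

selected = greedy_cfs(columns, labels)
all_rows = [list(r) for r in zip(*columns)]
sel_rows = [[columns[j][i] for j in selected] for i in range(len(labels))]

all_scores = evaluate(kmeans2(all_rows), labels)
sel_scores = evaluate(kmeans2(sel_rows), labels)
print("selected features:", selected)
print("all features:     ", all_scores)
print("selected features:", sel_scores)
```

On this toy data the noisy feature dominates the Euclidean distance, so clustering on all features splits along `f1`, while clustering on the selected subset splits along the informative feature. That is the general mechanism by which feature selection can help a distance-based clusterer; the size of the effect on real phishing data depends on the dataset.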

Source journal

Electronics (Computer Science: Computer Networks and Communications)

CiteScore: 1.10
Self-citation rate: 10.30%
Articles published: 3515
Review time: 16.71 days

Journal description: Electronics (ISSN 2079-9292; CODEN: ELECGJ) is an international, open access journal on the science of electronics and its applications published quarterly online by MDPI.