Analysis of Feature Selection and Phishing Website Classification Using Machine Learning

Shatha Ghareeb, M. Mahyoub, J. Mustafina
Venue: 2023 15th International Conference on Developments in eSystems Engineering (DeSE)
Publication date: 2023-01-09
DOI: 10.1109/DeSE58274.2023.10099697 (https://doi.org/10.1109/DeSE58274.2023.10099697)

Abstract

Phishing website detection is the task of classifying websites as phishing or legitimate based on URL parameters and certain behaviours of the site. Dependence on websites has become unavoidable, and as the internet and its user population have grown, cyber-attacks have become common. Attackers across the globe target unsuspecting users to steal confidential personal information such as login credentials and credit or debit card details, which may lead to serious financial loss and identity theft. One of the main challenges is that phishing URLs change constantly, so detection mechanisms need continual updating and can become obsolete within a short time. Most current phishing detection tools use the so-called black box method, in which known phishing URLs are stored and queried for verification. Because the URLs keep changing, this may not be an efficient approach. This study proposes a machine learning based approach together with a feature selection method that picks the set of features likely to contribute to higher detection accuracy. The proposed model is also intended to be simple, fast, and interpretable. The final model will be evaluated on efficiency, accuracy, and execution time.
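
The abstract does not say which features or which selection method the study uses. As a rough illustration of the general idea only, the sketch below derives a few lexical features from a URL and ranks them with a filter-style score (information gain). The feature names, the length threshold of 54 characters, the example URLs, and their labels are all assumptions made up for this sketch, not details taken from the paper.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse


def extract_features(url):
    """Map a URL to a few binary lexical features (illustrative choices only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    is_ip = bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))
    return {
        "has_ip_host": int(is_ip),
        "has_at_symbol": int("@" in url),
        "long_url": int(len(url) > 54),  # assumed threshold
        "many_subdomains": int(not is_ip and host.count(".") >= 3),
        "uses_https": int(parsed.scheme == "https"),
    }


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting on one feature's values."""
    total = len(labels)
    gain = entropy(labels)
    for value in set(values):
        subset = [y for x, y in zip(values, labels) if x == value]
        gain -= (len(subset) / total) * entropy(subset)
    return gain


def rank_features(urls, labels):
    """Score every feature by information gain, highest first."""
    rows = [extract_features(u) for u in urls]
    scores = {
        name: information_gain([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 1 = phishing, 0 = legitimate. These URLs and labels
# are invented, so the resulting scores say nothing about real phishing sites.
toy_urls = [
    "http://192.168.10.5/secure/login",
    "http://account.verify.update.example.com/signin",
    "http://example.com@198.51.100.7/reset",
    "https://www.example.org/",
    "https://docs.example.net/guide",
    "https://example.com/about",
]
toy_labels = [1, 1, 1, 0, 0, 0]

ranking = rank_features(toy_urls, toy_labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Keeping only the top-ranked features before training a classifier is one way to aim for the simpler, faster, and more interpretable model the abstract describes. Unlike a stored list of known URLs, features of this kind can also be computed for URLs that have never been seen before.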