Title: Analysis of Feature Selection and Phishing Website Classification Using Machine Learning
Authors: Shatha Ghareeb, M. Mahyoub, J. Mustafina
Venue: 2023 15th International Conference on Developments in eSystems Engineering (DeSE)
DOI: 10.1109/DeSE58274.2023.10099697 (https://doi.org/10.1109/DeSE58274.2023.10099697)
Publication date: 2023-01-09
Citations: 0
Abstract
Phishing website detection is the task of classifying a website as phishing or legitimate based on URL parameters and certain behaviours of the site. Dependence on websites is now unavoidable, and as the internet and its user population have grown, cyber-attacks have become common. Attackers across the globe target unsuspecting users to steal confidential personal information such as login credentials and credit or debit card details, which can lead to serious financial loss and identity theft. One of the main challenges is that phishing URLs change constantly, so a detection mechanism must be updated continually or it becomes obsolete within a short period. Most current phishing detection tools use the blacklist method, in which known phishing URLs are stored and queried for verification. Because the URLs change constantly, this approach may not be efficient. This study proposes a machine learning based approach together with a feature selection method that chooses the set of features most likely to contribute to higher detection accuracy. The proposed model is also intended to be simple, fast, and interpretable. The final model will be evaluated on efficiency, accuracy, and execution time.
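The contrast drawn in the abstract, between storing known phishing URLs for lookup and classifying from URL-derived features, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the example URLs, the `BLACKLIST` set, and the five lexical features are hypothetical and are not taken from the paper, which does not list its features here.

```python
from urllib.parse import urlparse

# Hypothetical stored list of known phishing URLs (illustrative value only).
BLACKLIST = {"http://secure-login.example.com/verify"}


def is_blacklisted(url: str) -> bool:
    """Exact-match lookup against the stored URL list."""
    return url in BLACKLIST


def url_features(url: str) -> dict:
    """Extract a few illustrative lexical features from a URL.

    These features are assumptions for the sketch, not the paper's feature set.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
    }


known = "http://secure-login.example.com/verify"
variant = "http://secure-login.example.com/verify?id=2"  # same site, altered URL

print(is_blacklisted(known))    # the stored URL is found
print(is_blacklisted(variant))  # the altered URL slips past the exact-match lookup
print(url_features(variant))    # host-based features are unchanged by the alteration
```

In this sketch the altered URL evades the exact-match lookup, while the host-based features stay the same for both URLs. A feature-based classifier could therefore still see the same signal, and a feature selection step would decide which of such features are worth keeping.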