On Effectiveness of Source Code and SSL Based Features for Phishing Website Detection
Authors: Roopak S, Athira P. Vijayaraghavan, Tony Thomas
Venue: 2019 1st International Conference on Advanced Technologies in Intelligent Control, Environment, Computing & Communication Engineering (ICATIECE)
Published: 2019-03-01
DOI: https://doi.org/10.1109/ICATIECE45860.2019.9063824
Citations: 8
Abstract
Phishing is a social engineering method that steals user credentials through data entry forms on malicious websites. Currently available anti-malware software can detect only blacklisted phishing websites. Similarity-based detection methods, such as visual similarity, are easily evaded by changing the textual and visual content of a phishing site. The phishing behavior of a web page can be identified from features based on its URL, domain, and source code. However, URL- and domain-based features can be easily defeated with black hat SEO techniques. In this paper, we extract relevant rules based on webpage source code and Secure Sockets Layer (SSL) features from a training dataset using the Repeated Incremental Pruning to Produce Error Reduction (RIPPER) algorithm. We then check for the presence of these rules in a test dataset. Our implementation results show that the rules based on webpage source code can identify phishing websites with an accuracy of 0.92.
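
The rule-checking step described in the abstract can be illustrated with a minimal sketch. It assumes that a RIPPER-style learner has already produced a set of if-then rules, each a conjunction of feature conditions, and that a page is flagged as phishing when any rule fires. The feature names (`has_password_form`, `form_action_external`, `valid_ssl_certificate`), the rules, and the test samples are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# A rule is a conjunction of (feature_name, required_value) conditions.
Rule = List[Tuple[str, bool]]
Features = Dict[str, bool]

# Hypothetical rules for illustration only; these are NOT the rules
# learned in the paper.
RULES: List[Rule] = [
    [("has_password_form", True), ("form_action_external", True)],
    [("has_password_form", True), ("valid_ssl_certificate", False)],
]


def rule_fires(rule: Rule, features: Features) -> bool:
    """Return True when every condition of the rule holds for the page."""
    return all(features.get(name) == value for name, value in rule)


def is_phishing(features: Features, rules: List[Rule] = RULES) -> bool:
    """Flag a page as phishing if any rule fires; otherwise treat it as legitimate."""
    return any(rule_fires(rule, features) for rule in rules)


def accuracy(samples: List[Tuple[Features, bool]], rules: List[Rule] = RULES) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(is_phishing(feats, rules) == label for feats, label in samples)
    return correct / len(samples)


# Made-up test samples: (features, true label), where True means phishing.
TEST_SET: List[Tuple[Features, bool]] = [
    ({"has_password_form": True, "form_action_external": True, "valid_ssl_certificate": True}, True),
    ({"has_password_form": True, "form_action_external": False, "valid_ssl_certificate": True}, False),
    ({"has_password_form": False, "form_action_external": False, "valid_ssl_certificate": False}, False),
    ({"has_password_form": True, "form_action_external": False, "valid_ssl_certificate": False}, True),
    ({"has_password_form": False, "form_action_external": True, "valid_ssl_certificate": True}, True),
]

if __name__ == "__main__":
    print(f"accuracy on the made-up test set: {accuracy(TEST_SET):.2f}")
```

In this sketch the last sample is a phishing page that no rule covers, so it is misclassified; this mirrors how a fixed rule set can miss pages whose features fall outside the learned conditions.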