Malicious URL Detection using NLP, Machine Learning and FLASK
A. Lakshmanarao, M. Babu, M. M. Bala Krishna
2021 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES), pp. 1-4. Published 2021-09-24.
DOI: https://doi.org/10.1109/ICSES52305.2021.9633889
Citations: 9
Abstract
A URL created to deliver spam or commit fraud is known as a malicious or phishing URL. Clicking such a URL can download viruses onto the user's system. Malicious URLs can lead to phishing and spam, and phishing compromises user credentials and other valuable information, so it is important to distinguish safe links from malicious ones. Many cyber-attacks originate from malicious URLs, and phishers change their attack techniques rapidly. Machine learning is a field of study in which a system learns from previous experience and reacts to future events, and its methods are useful in security applications. In this paper, we propose a machine-learning-based solution for detecting malicious websites. The experiments use a Kaggle dataset of more than 500,000 URLs. We applied three text feature extraction techniques (count vectorizer, hashing vectorizer, and TF-IDF vectorizer) and then built a phishing website detection model with four ML classifiers: Logistic Regression, K-NN, Decision Tree, and Random Forest. The model combining the hashing vectorizer with Random Forest achieved 97.5% accuracy. We also created a web app using Flask that reports whether an entered URL is malicious.
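The feature-hashing idea behind the hashing vectorizer can be illustrated with a minimal sketch. This is not the authors' implementation. It uses only the Python standard library, hand-rolls the hashing trick, and substitutes logistic regression (one of the four classifiers named in the abstract) for the Random Forest that gave the best reported result. The URLs, the `N_FEATURES` size, the learning rate, the epoch count, and all function names are assumptions made for illustration.

```python
import math
import re
import zlib

N_FEATURES = 2 ** 12  # assumed size of the hashed feature space


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


def hash_features(url, n_features=N_FEATURES):
    """Hashing trick: map each token to a bucket index and count occurrences."""
    features = {}
    for token in tokenize(url):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        idx = zlib.crc32(token.encode("utf-8")) % n_features
        features[idx] = features.get(idx, 0.0) + 1.0
    return features


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(features, weights, bias):
    """Linear score of a sparse feature dict."""
    return bias + sum(weights[i] * v for i, v in features.items())


def train_logistic(samples, labels, n_features=N_FEATURES, epochs=200, lr=0.2):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for feats, y in zip(samples, labels):
            err = sigmoid(score(feats, weights, bias)) - y
            bias -= lr * err
            for i, v in feats.items():
                weights[i] -= lr * err * v
    return weights, bias


def predict_proba(url, weights, bias):
    """Estimated probability that the URL is malicious (label 1)."""
    return sigmoid(score(hash_features(url, len(weights)), weights, bias))


# Made-up toy data (1 = malicious, 0 = benign); not taken from the Kaggle dataset.
TRAIN = [
    ("http://secure-login.bank-verify.example/update/account", 1),
    ("https://www.library.example/catalog/books", 0),
    ("https://free-prize.winner-claim.example/login/confirm", 1),
    ("http://news.weather.example/forecast/today", 0),
    ("http://account-suspended.verify-now.example/signin", 1),
    ("https://docs.project.example/guide/install", 0),
    ("http://192.0.2.10/paypal/login/verify.php", 1),
    ("https://shop.garden.example/tools/spades", 0),
]

samples = [hash_features(url) for url, _ in TRAIN]
labels = [label for _, label in TRAIN]
weights, bias = train_logistic(samples, labels)

if __name__ == "__main__":
    for url, label in TRAIN:
        print(label, round(predict_proba(url, weights, bias), 3), url)
```

A fixed-size hashed feature space avoids storing a vocabulary, which is one reason hashing is attractive for very large URL collections. A web front end such as the Flask app mentioned in the abstract would only need to call a function like `predict_proba` on the submitted URL and return the result.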