A Lightweight Phishing Website Detection Algorithm by Machine Learning
Chenyu Gu
2021 International Conference on Signal Processing and Machine Learning (CONF-SPML), 2021-11-01
DOI: https://doi.org/10.1109/CONF-SPML54095.2021.00054
With the rapid development of the Internet, phishing websites now have short life cycles and low construction costs. As a result, URL (uniform resource locator) based detection of phishing websites must handle a large amount of data, which increases retrieval time and lowers detection speed. To deal with diverse, complex, and hidden phishing websites, this paper proposes a lightweight detection framework. We first match URLs using the faster MinHash signature. If a website is suspicious, similarity detection is applied. If no similar site is found, machine-learning-based intention detection makes the final determination.
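
The abstract does not say how the MinHash signatures are built, so the following is only a minimal sketch of generic MinHash matching on URLs, not the paper's implementation. The character 3-gram shingling, the 64 hash functions, the salted BLAKE2b hash, and the function names `shingles`, `minhash_signature`, and `estimated_similarity` are all assumptions made for illustration.

```python
import hashlib


def shingles(url, k=3):
    """Return the set of character k-grams of a lowercased URL (assumed shingling)."""
    url = url.lower()
    if len(url) < k:
        return {url}
    return {url[i:i + k] for i in range(len(url) - k + 1)}


def _hash(seed, token):
    """64-bit hash of a token; each seed acts as a different hash function."""
    digest = hashlib.blake2b(
        token.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(url, num_hashes=64, k=3):
    """For each hash function, keep the minimum hash value over all shingles."""
    grams = shingles(url, k)
    return [min(_hash(seed, g) for g in grams) for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching positions, an estimate of the Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    # Hypothetical URLs, chosen only to show a near-duplicate and an unrelated one.
    known = minhash_signature("http://paypal.com/login")
    lookalike = minhash_signature("http://paypa1.com/login")
    unrelated = minhash_signature("aaaaaa")
    print("lookalike:", estimated_similarity(known, lookalike))
    print("unrelated:", estimated_similarity(known, unrelated))
```

Because a signature has a fixed length regardless of URL length, comparing two URLs costs a constant number of integer comparisons, which is one plausible reason such a signature suits a lightweight first-stage match.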