Identification and Filtering of Web Spams Using a Machine Learning Method
Dawei Zhang, Yanyu Liu
Int. J. Comput. Intell. Appl., published 2022-12-20
DOI: https://doi.org/10.1142/s1469026822500237
Citation count: 0
Abstract
To improve spam filtering on the Internet and the experience of Internet users, this paper converted email text into vector features using the vector space model, arranged the features into a two-dimensional matrix, and used a convolutional neural network (CNN) to identify spam. In simulation experiments, the CNN was compared with two other classifiers: a support vector machine (SVM) and a back-propagation neural network (BPNN). The results showed that the spam recognition algorithm with the CNN classifier achieved better recognition performance than the algorithms with the SVM and BPNN classifiers, and that it also compared favorably in recognition cost and recognition time. In addition, the CNN achieved its best recognition performance when the number of extracted features was 15.
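The abstract does not specify how the vector space features are computed or how the two-dimensional matrix is laid out, so the following is only a minimal sketch under stated assumptions: TF-IDF weighting with smoothed inverse document frequency, feature selection by document frequency, and a 3 x 5 grid for the 15 features. The function names (`tokenize`, `build_vocabulary`, `tfidf_vector`, `to_matrix`) and the sample emails are hypothetical, and the CNN itself is omitted.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and split on anything that is not a letter (a deliberately simple tokenizer).
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()

def build_vocabulary(docs, k):
    # Assumption: keep the k terms that appear in the most documents,
    # breaking ties alphabetically. The paper's selection rule is not given.
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]], df

def tfidf_vector(doc, vocab, df, n_docs):
    # Term frequency times smoothed inverse document frequency for each vocabulary term.
    tf = Counter(tokenize(doc))
    total = sum(tf.values()) or 1
    return [
        (tf[t] / total) * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
        for t in vocab
    ]

def to_matrix(vec, rows, cols):
    # Assumption: lay the feature vector out row by row, zero-padding if it is short.
    if len(vec) > rows * cols:
        raise ValueError("vector does not fit in the requested matrix")
    padded = vec + [0.0] * (rows * cols - len(vec))
    return [padded[r * cols:(r + 1) * cols] for r in range(rows)]

# Hypothetical sample emails, for illustration only.
emails = [
    "Win a free prize now, click here to claim your free prize",
    "Meeting agenda for Monday attached, please review",
    "Free offer: claim cheap loans now",
]

vocab, df = build_vocabulary(emails, 15)
matrices = [to_matrix(tfidf_vector(e, vocab, df, len(emails)), 3, 5) for e in emails]
```

Each entry of `matrices` is a 3 x 5 grid that could be passed to a convolutional classifier as a single-channel input. The grid shape is an illustrative choice, not one taken from the paper.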