Spam host classification using PSO-SVM
A. Enache, V. Sgârciu
2014 IEEE International Conference on Automation, Quality and Testing, Robotics
Publication date: 2014-05-22
DOI: 10.1109/AQTR.2014.6857840 (https://doi.org/10.1109/AQTR.2014.6857840)
Citation count: 4
Abstract
Search engines have become the de facto starting point for finding information on the Internet. Sabotaging the quality of the results that search engines retrieve can lead users to distrust the search engine provider, and spam websites can also serve as a means of phishing. This paper presents a spam host detection approach that uses support vector machines (SVM) for classification. We create a parallel version of standard Particle Swarm Optimization (PSO) to determine the free parameters of the SVM classifier and apply our proposed model to a content web spamming dataset, WEBSPAM-UK2011. Our implementation of the parallel PSO is built on a pool of threads, and each thread executes the tasks associated with a particle from the swarm. Experiments showed that our proposed model can achieve higher accuracy than a regular SVM and outperforms other classifiers (C4.5, Naive Bayes). Furthermore, the parallel version of standard Particle Swarm Optimization (PSO) can efficiently select parameters for the SVM.
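
The thread-pool design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain global-best PSO in which each particle's fitness evaluation is submitted as a task to a pool of threads. The abstract does not name the free parameters, so the sketch assumes they are log2(C) and log2(gamma) of an RBF-kernel SVM. The function `surrogate_cv_error` is a hypothetical stand-in for the cross-validated classification error, and all names, bounds, and coefficient values (`w`, `c1`, `c2`) are illustrative assumptions.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def parallel_pso(objective, bounds, n_particles=12, n_iters=40,
                 w=0.7, c1=1.5, c2=1.5, n_threads=4, seed=0):
    """Minimise `objective` over the box `bounds` with a global-best PSO.

    Fitness evaluations run on a thread pool, one task per particle.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [float("inf")] * n_particles
    gbest, gbest_val = pos[0][:], float("inf")

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for _ in range(n_iters):
            # One task per particle; map() returns results in particle order.
            values = list(pool.map(objective, pos))
            for i, val in enumerate(values):
                if val < pbest_val[i]:
                    pbest_val[i], pbest[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest = val, pos[i][:]
            # Velocity and position update, with positions clamped to bounds.
            for i in range(n_particles):
                for d, (lo, hi) in enumerate(bounds):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][d] = (w * vel[i][d]
                                 + c1 * r1 * (pbest[i][d] - pos[i][d])
                                 + c2 * r2 * (gbest[d] - pos[i][d]))
                    pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
    return gbest, gbest_val


def surrogate_cv_error(params):
    """Hypothetical stand-in for cross-validated SVM error.

    A real objective would train an SVM with C = 2**log_c and
    gamma = 2**log_gamma and return the validation error.
    """
    log_c, log_gamma = params
    return (log_c - 2.0) ** 2 + (log_gamma + 3.0) ** 2


# Assumed search box for (log2 C, log2 gamma).
BOUNDS = [(-5.0, 10.0), (-10.0, 5.0)]
best_params, best_error = parallel_pso(surrogate_cv_error, BOUNDS)
```

Because the random numbers are drawn in the main thread and `map()` preserves particle order, the search is reproducible for a fixed seed regardless of how the pool schedules its tasks. In CPython, threads only speed up the evaluations if the objective releases the global interpreter lock, as native training routines often do; otherwise a process pool may be the better fit.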