Spam Detection Based on Nearest Community Classifier
Authors: Michal Prilepok, M. Kudelka
Published in: 2015 International Conference on Intelligent Networking and Collaborative Systems
Publication date: 2015-09-02
DOI: 10.1109/INCoS.2015.75 (https://doi.org/10.1109/INCoS.2015.75)
Citation count: 3
Abstract
Unwanted email (spam) is a growing problem, not only for users but also for Internet service providers. The design of new spam detection algorithms is therefore an active research topic. We define two requirements for such algorithms and apply them simultaneously. The first is a low rate of falsely detected emails, which affects the algorithm's performance. The second is fast spam detection, which minimizes the delay in receiving emails. In this paper, we focus on the first requirement. To address it, we apply network community analysis: the approach is to find communities, that is, groups of similar emails. We present a new nearest community classifier and apply it to spam detection. The results are very close to those of a Bayesian spam filter: we achieved 93.78% accuracy, and the algorithm detects 80.72% of spam emails and 98.01% of non-spam emails.
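The abstract does not say how emails are compared or how communities are found, so the following is only a minimal sketch of what a nearest community classifier could look like, not the authors' method. It assumes Jaccard similarity over word sets, treats a community as a connected component of a similarity graph cut at a hypothetical `threshold`, and assigns a new email the majority label of the community with the highest mean similarity. All function names and the toy data are illustrative.

```python
from collections import Counter


def tokens(text):
    """Lowercased word set of an email (an assumed representation)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets (an assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_communities(docs, threshold):
    """Group training emails into communities.

    Assumption: a community is a connected component of the graph that
    links two emails whenever their similarity reaches `threshold`.
    """
    parent = list(range(len(docs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if jaccard(docs[i], docs[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def classify(email, docs, labels, communities):
    """Return the majority label of the community nearest to `email`."""
    def mean_similarity(members):
        return sum(jaccard(email, docs[i]) for i in members) / len(members)

    nearest = max(communities, key=mean_similarity)
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


# Toy training data, invented purely for illustration.
train = [
    ("win free money now", "spam"),
    ("free money win prize now", "spam"),
    ("meeting agenda for monday", "non-spam"),
    ("agenda for monday project meeting", "non-spam"),
]
docs = [tokens(text) for text, _ in train]
labels = [label for _, label in train]
communities = build_communities(docs, threshold=0.5)

print(classify(tokens("claim your free prize money now"), docs, labels, communities))
print(classify(tokens("monday meeting agenda"), docs, labels, communities))
```

On this toy data the two spam messages and the two non-spam messages each form one community, so a new message is labeled by whichever group it overlaps with most. The pairwise comparison is quadratic in the number of training emails, which is acceptable for a sketch but would need indexing or sampling at realistic scale.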