Comparative Results of Spam Email Detection Using Machine Learning Algorithms
Rodica Paula Cota, Daniel Zinca
2022 14th International Conference on Communications (COMM), 2022-06-16
https://doi.org/10.1109/comm54429.2022.9817305
Among the problems caused by spam email are loss of productivity and increased consumption of network resources. Spam emails sometimes carry malware as attachments or include links to phishing websites, leading to theft and loss of data. Many email servers filter spam, but the task becomes increasingly difficult as spammers craft messages that look similar to normal email. In this paper we implemented five machine learning algorithms in Python using the scikit-learn library and compared their performance on two publicly available spam email corpora. The algorithms discussed are Support Vector Machine, Random Forest, Logistic Regression, Multinomial Naive Bayes and Gaussian Naive Bayes.
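The kind of comparison described above can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the feature extraction, the hyperparameters, or the evaluation metric, so the TF-IDF features, the linear SVM kernel, the 75/25 split and the accuracy score are all assumptions. The toy messages are hypothetical and stand in for the two public corpora, which the abstract does not name.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB, MultinomialNB
from sklearn.svm import SVC

# Hypothetical toy messages; NOT the corpora used in the paper.
SPAM = [
    "win a free prize now click here",
    "cheap loans approved instantly apply today",
    "claim your free gift card now",
    "urgent verify your account password here",
    "you won the lottery send your bank details",
    "limited offer buy cheap pills online",
    "free money click this link now",
    "congratulations you are selected for a cash prize",
]
HAM = [
    "meeting moved to monday at ten",
    "please review the attached project report",
    "lunch tomorrow with the team",
    "can you send me the lecture notes",
    "the build passed and the release is ready",
    "thanks for your help with the review",
    "see you at the conference next week",
    "minutes from the budget meeting are attached",
]


def compare_classifiers(texts, labels, seed=0):
    """Train the five classifiers on one split and return accuracy per model."""
    # Stratified split keeps the spam/ham ratio the same in both parts.
    train_txt, test_txt, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, stratify=labels, random_state=seed
    )

    # Assumed features: TF-IDF over words. The vectorizer is fitted on the
    # training part only. GaussianNB needs dense input, so the sparse
    # matrices are converted to arrays (fine for a toy corpus).
    vectorizer = TfidfVectorizer()
    X_train = vectorizer.fit_transform(train_txt).toarray()
    X_test = vectorizer.transform(test_txt).toarray()

    # Hyperparameters are illustrative defaults, not values from the paper.
    models = {
        "Support Vector Machine": SVC(kernel="linear"),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Multinomial Naive Bayes": MultinomialNB(),
        "Gaussian Naive Bayes": GaussianNB(),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


texts = SPAM + HAM
labels = [1] * len(SPAM) + [0] * len(HAM)  # 1 = spam, 0 = ham
results = compare_classifiers(texts, labels)
for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.2f}")
```

With only sixteen messages the printed accuracies carry no statistical weight. On a real corpus, precision and recall for the spam class, averaged over cross-validation folds, would likely give a fairer comparison than a single accuracy figure.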