The Optimization of Threshold-Based Naive Bayesian Algorithm
Xin Wang, Hua Jiang
2009 Third International Conference on Genetic and Evolutionary Computing, 2009-10-14
DOI: 10.1109/WGEC.2009.161 (https://doi.org/10.1109/WGEC.2009.161)
For text classification and spam filtering, the Naive Bayesian algorithm estimates which class a text belongs to from statistical probability values derived from the features of the training samples. This computation, however, is prone to an overflow problem. This article optimizes the algorithm by setting a threshold: for each class, the strategy compares the number of times the class probability exceeds the threshold and, when these counts are equal, the accumulated probability values. Compared with the existing method, experimental results show that the new method not only solves the overflow problem but also improves classification performance.
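The abstract does not give the exact procedure, so the following Python sketch shows only one possible reading of the threshold idea, not the authors' implementation. It assumes that each class score is a running product of a prior and per-feature likelihoods, all in (0, 1] (that is, smoothed so that none is zero). Whenever the running product crosses below a threshold, the product is rescaled and a crossing counter is incremented. Classes are then compared first by crossing count and, for equal counts, by the remaining accumulated value. The names `scaled_score` and `classify` and the value of `THRESHOLD` are hypothetical. With probabilities below 1, the range problem usually shows up as the naive product collapsing to 0.0 in floating point.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical threshold; the abstract does not state a value.
THRESHOLD = 1e-100


def scaled_score(prior: float, likelihoods: Iterable[float],
                 threshold: float = THRESHOLD) -> Tuple[int, float]:
    """Return (crossings, residual) for one class.

    The full product equals residual * threshold ** crossings, but it is
    never formed directly, so it stays inside floating-point range.
    Assumes prior and every likelihood lie in (0, 1].
    """
    crossings = 0
    residual = prior
    for p in likelihoods:
        residual *= p
        # Rescale each time the running product falls below the threshold.
        while 0.0 < residual < threshold:
            residual /= threshold
            crossings += 1
    return crossings, residual


def classify(priors: Dict[str, float],
             likelihoods_by_class: Dict[str, List[float]],
             threshold: float = THRESHOLD) -> str:
    """Pick the class with the fewest crossings; break ties by residual."""
    def key(label: str) -> Tuple[int, float]:
        crossings, residual = scaled_score(
            priors[label], likelihoods_by_class[label], threshold)
        # Fewer crossings means a larger product, so negate the count.
        return (-crossings, residual)
    return max(priors, key=key)


if __name__ == "__main__":
    priors = {"spam": 0.5, "ham": 0.5}
    # 400 features: the naive product would be about 1e-800 for "spam".
    likelihoods = {"spam": [0.01] * 400, "ham": [0.001] * 400}
    print(classify(priors, likelihoods))
```

Under these assumptions the pair (crossings, residual) orders classes the same way the exact product would, because the residual always stays between the threshold and 1. Summing logarithms of the probabilities is the more common way to avoid the same range problem; the counter-based form is shown here only because it mirrors the comparison described in the abstract.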