An artificial immune system with local feature selection classifier for spam filtering
Mayank Kalbhor, S. Shrivastava, Babita Ujjainiya
2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT), pp. 1-7. Published 2013-07-04.
DOI: 10.1109/ICCCNT.2013.6726691 (https://doi.org/10.1109/ICCCNT.2013.6726691)
The Local Concentration (LC) based feature extraction approach is considered able to extract position-related information from messages effectively by transforming each area of a message into a corresponding LC feature. To incorporate the LC approach into the overall spam filtering process, an LC model is designed in which two kinds of detector sets are first generated using term selection strategies and a well-defined tendency threshold; a window is then applied to divide the message into local areas. After the message is segmented, the concentrations of the detectors are calculated and taken as the features of each local area. Finally, a feature vector is created by combining the features of all local areas, and a classification method inspired by the immune system is applied to this feature vector. To evaluate the model, several experiments are conducted on four benchmark corpora using cross-validation. The results show that the model performs well with Information Gain as the term selection method and that the LC based feature extraction method has flexible applicability in the real world. Compared with global-concentration based feature extraction techniques such as bag-of-words, the LC approach performs better in terms of both accuracy and measure. It is also demonstrated that the LC approach combined with an artificial immune system inspired classifier gives better results on all parameters.
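The feature extraction steps described in the abstract (detector sets, windowing, per-area concentrations, combined feature vector) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: the tendency of a term is taken as the difference between its document frequency in spam and in legitimate mail, the message is split into a fixed number of near-equal areas, and concentration is the fraction of distinct tokens in an area that match a detector set. The Information Gain term selection step and the immune-inspired classifier are omitted, and all function names are hypothetical.

```python
from collections import Counter


def build_detector_sets(spam_docs, ham_docs, threshold=0.2):
    """Split terms into spam and ham detector sets by tendency.

    Assumption: every term is a candidate (no Information Gain filtering).
    """
    spam_df = Counter(t for doc in spam_docs for t in set(doc))
    ham_df = Counter(t for doc in ham_docs for t in set(doc))
    spam_detectors, ham_detectors = set(), set()
    for term in set(spam_df) | set(ham_df):
        # Assumed tendency: P(term | spam) - P(term | ham), estimated
        # from document frequencies in the training messages.
        tendency = spam_df[term] / len(spam_docs) - ham_df[term] / len(ham_docs)
        if tendency > threshold:
            spam_detectors.add(term)
        elif tendency < -threshold:
            ham_detectors.add(term)
    return spam_detectors, ham_detectors


def split_into_areas(tokens, n_areas):
    """Divide a token list into n_areas contiguous areas of near-equal size."""
    size, extra = divmod(len(tokens), n_areas)
    areas, start = [], 0
    for i in range(n_areas):
        end = start + size + (1 if i < extra else 0)
        areas.append(tokens[start:end])
        start = end
    return areas


def concentration(area, detectors):
    """Fraction of distinct tokens in the area that match a detector (assumed definition)."""
    distinct = set(area)
    if not distinct:
        return 0.0
    return len(distinct & detectors) / len(distinct)


def lc_feature_vector(tokens, spam_detectors, ham_detectors, n_areas=3):
    """Concatenate (spam, ham) concentrations of every local area."""
    features = []
    for area in split_into_areas(tokens, n_areas):
        features.append(concentration(area, spam_detectors))
        features.append(concentration(area, ham_detectors))
    return features


# Toy usage with made-up, pre-tokenized messages.
spam = [["win", "cash", "now"], ["cash", "prize", "now"]]
ham = [["meeting", "at", "noon"], ["lunch", "at", "noon"]]
spam_det, ham_det = build_detector_sets(spam, ham, threshold=0.5)
message = ["win", "cash", "meeting", "noon", "prize", "now"]
print(lc_feature_vector(message, spam_det, ham_det, n_areas=3))
```

Using a fixed number of areas keeps the feature vector the same length (two values per area) for every message, which is what a downstream classifier needs. A fixed-length sliding window would be an alternative, but it would require padding or truncation to get a constant dimension.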