Incremental clustering-based spam image filtering using representative images
Authors: Yingying He, Wengang Man, Haibo He
Published in: 2011 International Conference on System science, Engineering design and Manufacturing informatization, 2011-11-18
DOI: 10.1109/ICSSEM.2011.6081310 (https://doi.org/10.1109/ICSSEM.2011.6081310)
This paper proposes an incremental spam image filtering (ISIF) approach based on visual similarity. It addresses two practical problems that existing spam image filtering techniques do not handle well: how to update a model efficiently, and how to cope with the lack of normal email images. The basic idea of ISIF is to learn incrementally what spam images look like by clustering spam images and selecting representative images (RI) from the clusters, and then to use the RI to classify unknown images. An ISIF filter is updated by adding new RI, which is efficient because retraining considers only the spam images the filter missed rather than an expanded training set. Because ISIF relies only on spam images, it avoids the difficulty of collecting enough normal email images. Experimental results on a real spam image dataset show that the incremental filter based on ISIF detects spam images with high accuracy and a low false positive rate.
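The workflow described in the abstract (cluster spam images, keep representative images, classify by similarity, update from missed spam only) can be illustrated with a minimal sketch. The abstract does not specify the visual features, the clustering algorithm, or the similarity measure, so everything concrete here is an assumption: images are taken to be already reduced to numeric feature vectors, a simple leader-style clustering with Euclidean distance stands in for the paper's clustering step, and the names ISIFSketch, radius, fit, is_spam, and update are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class ISIFSketch:
    """Illustrative one-class filter built only from spam feature vectors.

    Assumption: leader-style clustering, where the first vector of each
    cluster serves as its representative image (RI). This is a stand-in,
    not the clustering method used in the paper.
    """

    def __init__(self, radius: float) -> None:
        self.radius = radius  # hypothetical similarity threshold
        self.representatives: List[Vector] = []

    def is_spam(self, v: Vector) -> bool:
        """Flag v as spam if it lies within radius of any RI."""
        return any(distance(v, r) <= self.radius for r in self.representatives)

    def fit(self, spam: Sequence[Vector]) -> int:
        """Cluster spam vectors; each uncovered vector becomes a new RI.

        Returns the number of RIs added.
        """
        added = 0
        for v in spam:
            if not self.is_spam(v):
                self.representatives.append(v)
                added += 1
        return added

    def update(self, labeled_spam: Sequence[Vector]) -> int:
        """Incremental update: retrain only on spam the filter missed."""
        missed = [v for v in labeled_spam if not self.is_spam(v)]
        return self.fit(missed)


if __name__ == "__main__":
    f = ISIFSketch(radius=0.1)
    # Made-up 2-D feature vectors, purely for illustration.
    f.fit([(0.9, 0.1), (0.88, 0.12), (0.1, 0.9)])
    print(f.is_spam((0.5, 0.5)))          # no RI nearby yet
    f.update([(0.5, 0.52), (0.9, 0.1)])   # only the missed vector is used
    print(f.is_spam((0.5, 0.5)))          # now covered by the new RI
```

In this sketch the update step never revisits earlier training data: existing representatives are kept as they are, and only vectors the current filter fails to flag can add new ones. No normal email images are needed at any point, which mirrors the one-class design the abstract describes.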