An experimental evaluation of spam filter performance and robustness against attack
Steve Webb, Subramanyam Chitti, C. Pu
2005 International Conference on Collaborative Computing: Networking, Applications and Worksharing, published 2005-12-19
DOI: 10.1109/COLCOM.2005.1651219 (https://doi.org/10.1109/COLCOM.2005.1651219)
In this paper, we show experimentally that learning filters can classify large corpora of spam and legitimate email messages with a high degree of accuracy. The corpora in our experiments contain about half a million spam messages and a similar number of legitimate messages, making them two orders of magnitude larger than the corpora used in current research. Using such large corpora represents a collaborative approach to spam filtering because the corpora combine spam and legitimate messages from many different sources. First, we show that this collaborative approach creates very accurate spam filters. Then, we introduce an effective attack against these filters that successfully degrades their ability to classify spam. Finally, we present an effective solution to this attack, which involves retraining the filters to accurately identify the attack messages.
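The train, attack, and retrain cycle described in the abstract can be illustrated with a toy example. The abstract does not name the learning filter or the attack, so everything below is an assumption made for illustration: the sketch uses a small multinomial naive Bayes filter with Laplace smoothing, and a hypothetical padding attack that appends legitimate-looking words to a spam message. The class `NaiveBayesFilter`, the helper `build_filter`, and the six-message corpus are invented here and are not taken from the paper.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Toy multinomial naive Bayes filter (illustrative, not the paper's filter)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Count whitespace-separated, lowercased tokens for the given class.
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def spam_log_odds(self, text):
        # Assumes at least one training message per class.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        totals = {c: sum(self.word_counts[c].values()) for c in ("spam", "ham")}
        score = math.log(self.doc_counts["spam"] / self.doc_counts["ham"])
        for word in text.lower().split():
            # Laplace (add-one) smoothing avoids zero probabilities.
            p_spam = (self.word_counts["spam"][word] + 1) / (totals["spam"] + len(vocab))
            p_ham = (self.word_counts["ham"][word] + 1) / (totals["ham"] + len(vocab))
            score += math.log(p_spam / p_ham)
        return score

    def is_spam(self, text):
        return self.spam_log_odds(text) > 0


# Hypothetical miniature corpus; the paper's corpora are far larger.
SPAM = ["cheap pills buy now", "win money now", "buy cheap watches"]
HAM = ["meeting agenda for monday", "project report attached", "lunch on monday"]
# Assumed attack: pad the spam with words typical of legitimate mail.
PADDING = "meeting agenda project report monday lunch attached"


def build_filter():
    spam_filter = NaiveBayesFilter()
    for text in SPAM:
        spam_filter.train(text, "spam")
    for text in HAM:
        spam_filter.train(text, "ham")
    return spam_filter


if __name__ == "__main__":
    spam_filter = build_filter()
    plain = "buy cheap pills now"
    attack = plain + " " + PADDING
    print("plain spam flagged:", spam_filter.is_spam(plain))
    print("attack flagged before retraining:", spam_filter.is_spam(attack))
    spam_filter.train(attack, "spam")  # retrain on the attack message
    print("attack flagged after retraining:", spam_filter.is_spam(attack))
```

In this toy setting the padded message slips past the filter until an example of it is added to the spam training data, which mirrors the retraining idea in the abstract only in spirit. It says nothing about the accuracy figures or the attack details reported in the paper.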