Using closed frequent sets to cluster malwares
A. Sprague, Adam Rhodes, Gary Warner
2013 XXIV International Conference on Information, Communication and Automation Technologies (ICAT), 2013-10-01
DOI: 10.1109/ICAT.2013.6684043 (https://doi.org/10.1109/ICAT.2013.6684043)

Static analysis of malware at UAB begins with the receipt of about 5000 samples each day. One of our goals is to cluster these samples into families. Each sample is an executable, which we represent for processing by the set of printable strings it contains. One method we have pursued for clustering starts with the data mining technique of generating frequent itemsets. Our application demands low support thresholds, at which frequent itemsets are difficult to generate. This paper discusses our successful efforts to overcome this barrier.
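
As a rough illustration of the representation described above, the following Python sketch turns each executable into its set of printable strings and then lists the closed frequent itemsets, that is, the itemsets that reach the support threshold and have no proper superset with the same support. It is not the method of the paper: the function names, the minimum string length of 4, the toy byte strings, and the naive intersection-based enumeration are all assumptions made for the example, and the enumeration would not scale to thousands of samples per day.

```python
import re


def printable_strings(data, min_len=4):
    """Return the set of printable-ASCII runs of at least min_len bytes.

    min_len=4 is an assumed default, not a value taken from the paper.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return frozenset(m.decode("ascii") for m in pattern.findall(data))


def closed_frequent_itemsets(transactions, min_support):
    """Map each closed itemset with support >= min_support to its support.

    Naive approach for small inputs: the non-empty closed itemsets are
    exactly the non-empty intersections of groups of transactions, so we
    close the family of transactions under pairwise intersection.
    """
    closed = {t for t in transactions if t}
    frontier = set(closed)
    while frontier:
        new = set()
        for a in frontier:
            for b in closed:
                c = a & b
                if c and c not in closed:
                    new.add(c)
        closed |= new
        frontier = new

    result = {}
    for itemset in closed:
        # Support = number of samples whose string set contains the itemset.
        support = sum(1 for t in transactions if itemset <= t)
        if support >= min_support:
            result[itemset] = support
    return result


# Toy stand-ins for executables (hypothetical bytes, not real samples).
samples = [
    b"\x00\x01kernel32.dll\x00CreateFileA\x00\x02evil.example\x00",
    b"kernel32.dll\x00\xffCreateFileA\x00\x01other\x00",
    b"\x00user32.dll\x00abc\x00",
]
transactions = [printable_strings(s) for s in samples]
closed_sets = closed_frequent_itemsets(transactions, min_support=2)

for itemset, support in closed_sets.items():
    print(sorted(itemset), support)
```

Samples that contain the same closed itemset share a block of strings and are therefore candidates for the same family. Each sample's full string set has support 1 here, which is why a threshold of 2 leaves only the strings common to the first two samples.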