{"title":"BotDetector:一种检测恶意软件感染设备的强大且可扩展的方法","authors":"Sho Mizuno, Mitsuhiro Hatada, Tatsuya Mori, Shigeki Goto","doi":"10.1109/ICC.2017.7997372","DOIUrl":null,"url":null,"abstract":"Damage caused by malware is a serious problem that needs to be addressed. The recent rise in the spread of evasive malware has made it difficult to detect it at the pre-infection timing. Malware detection at post-infection timing is a promising approach that fulfills this gap. Given this background, this work aims to identify likely malware-infected devices from the measurement of Internet traffic. The advantage of the traffic-measurement-based approach is that it enables us to monitor a large number of clients. If we find a client as a source of malicious traffic, the client is likely a malware-infected device. Since the majority of malware today makes use of the web as a means to communicate with the C&C servers that reside on the external network, we leverage information recorded in the HTTP headers to discriminate between malicious and legitimate traffic. To make our approach scalable and robust, we develop the automatic template generation scheme that drastically reduces the amount of information to be kept while achieving the high accuracy of classification; since it does not make use of any domain knowledge, the approach should be robust against changes of malware. We apply several classifiers, which include machine learning algorithms, to the extracted templates and classify traffic into two categories: malicious and legitimate. Our extensive experiments demonstrate that our approach discriminates between malicious and legitimate traffic with up to 97.1% precision while maintaining the false positive below 1.0%.","PeriodicalId":6517,"journal":{"name":"2017 IEEE International Conference on Communications (ICC)","volume":"46 1","pages":"1-7"},"PeriodicalIF":0.0000,"publicationDate":"2017-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"BotDetector: A robust and scalable approach toward detecting malware-infected devices\",\"authors\":\"Sho Mizuno, Mitsuhiro Hatada, Tatsuya Mori, Shigeki Goto\",\"doi\":\"10.1109/ICC.2017.7997372\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Damage caused by malware is a serious problem that needs to be addressed. The recent rise in the spread of evasive malware has made it difficult to detect it at the pre-infection timing. Malware detection at post-infection timing is a promising approach that fulfills this gap. Given this background, this work aims to identify likely malware-infected devices from the measurement of Internet traffic. The advantage of the traffic-measurement-based approach is that it enables us to monitor a large number of clients. If we find a client as a source of malicious traffic, the client is likely a malware-infected device. Since the majority of malware today makes use of the web as a means to communicate with the C&C servers that reside on the external network, we leverage information recorded in the HTTP headers to discriminate between malicious and legitimate traffic. To make our approach scalable and robust, we develop the automatic template generation scheme that drastically reduces the amount of information to be kept while achieving the high accuracy of classification; since it does not make use of any domain knowledge, the approach should be robust against changes of malware. We apply several classifiers, which include machine learning algorithms, to the extracted templates and classify traffic into two categories: malicious and legitimate. Our extensive experiments demonstrate that our approach discriminates between malicious and legitimate traffic with up to 97.1% precision while maintaining the false positive below 1.0%.\",\"PeriodicalId\":6517,\"journal\":{\"name\":\"2017 IEEE International Conference on Communications (ICC)\",\"volume\":\"46 1\",\"pages\":\"1-7\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-07-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE International Conference on Communications (ICC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICC.2017.7997372\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Conference on Communications (ICC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICC.2017.7997372","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
BotDetector: A robust and scalable approach toward detecting malware-infected devices
Damage caused by malware is a serious problem that needs to be addressed. The recent rise in the spread of evasive malware has made it difficult to detect it at the pre-infection timing. Malware detection at post-infection timing is a promising approach that fulfills this gap. Given this background, this work aims to identify likely malware-infected devices from the measurement of Internet traffic. The advantage of the traffic-measurement-based approach is that it enables us to monitor a large number of clients. If we find a client as a source of malicious traffic, the client is likely a malware-infected device. Since the majority of malware today makes use of the web as a means to communicate with the C&C servers that reside on the external network, we leverage information recorded in the HTTP headers to discriminate between malicious and legitimate traffic. To make our approach scalable and robust, we develop the automatic template generation scheme that drastically reduces the amount of information to be kept while achieving the high accuracy of classification; since it does not make use of any domain knowledge, the approach should be robust against changes of malware. We apply several classifiers, which include machine learning algorithms, to the extracted templates and classify traffic into two categories: malicious and legitimate. Our extensive experiments demonstrate that our approach discriminates between malicious and legitimate traffic with up to 97.1% precision while maintaining the false positive below 1.0%.