Weibin Meng, Y. Liu, Shenglin Zhang, Dan Pei, Hui Dong, Lei Song, Xulong Luo
2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), published 2018-06-01. DOI: 10.1109/IWQoS.2018.8624141 (https://doi.org/10.1109/IWQoS.2018.8624141)
Device-Agnostic Log Anomaly Classification with Partial Labels
Anomaly classification, i.e., detecting whether a network device is anomalous and, if so, determining its anomaly category, plays a crucial role in troubleshooting. Compared to KPI curves, device logs contain much more valuable information for anomaly classification. However, regular-expression-based anomaly classification techniques cannot address the challenges of log anomaly classification. We propose LogClass, a data-driven framework to detect and classify anomalies based on device logs. LogClass combines a word representation method and the PU learning model to construct a device-agnostic vocabulary with partial labels. We evaluate LogClass on tens of millions of switch logs collected from several real-world datacenters owned by a top global search engine. Our results show that LogClass achieves a 99.515% F1 score in anomalous log detection, and 95.32% Macro-F1 and 99.74% Micro-F1 in anomalous log classification, in a computationally efficient manner.
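The abstract reports both Macro-F1 and Micro-F1 for the classification step. The sketch below illustrates how the two averages differ. It is not the authors' evaluation code, and the category names ("link", "power") and the ten toy labels are hypothetical, chosen only to show that Micro-F1 can exceed Macro-F1 when a rare category is classified less accurately than a common one.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when there are no counts at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    # Macro: average the per-class F1 scores, so every class weighs equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, so every log weighs equally.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical toy data: 8 "link" anomalies and 2 "power" anomalies,
# with one "power" log misclassified as "link".
y_true = ["link"] * 8 + ["power"] * 2
y_pred = ["link"] * 8 + ["power", "link"]

macro, micro = macro_micro_f1(y_true, y_pred)
print(f"Macro-F1 = {macro:.4f}, Micro-F1 = {micro:.4f}")
```

On this toy data the single error costs the rare "power" class far more than it costs the pooled count, so Macro-F1 falls below Micro-F1. An imbalance between anomaly categories is one plausible reason for a gap of the kind reported above, though the abstract itself does not state the cause.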