Aum Patil, Amey Wadekar, Tanishq Gupta, Rohit Vijan, F. Kazi
Explainable LSTM Model for Anomaly Detection in HDFS Log File using Layerwise Relevance Propagation
2019 IEEE Bombay Section Signature Conference (IBSSC), published 2019-07-01
DOI: 10.1109/IBSSC47189.2019.8973044
Citations: 14
Abstract
Anomaly detection is critically important in log file systems, and many supervised techniques have been explored to address it. Deep learning approaches have shown great promise for log file anomaly detection because they learn high-level features and non-linearities, eliminating the need for domain-specific knowledge or special pre-processing. This improved performance, however, comes at the cost of explainability, owing to the black-box nature of such models. In this paper, we propose a solution that uses an LSTM-LRP (Long Short-Term Memory with Layerwise Relevance Propagation) architecture for discrete event sequences, which are obtained by processing log files using log keys derived from individual entries. We extend the idea of LSTM-LRP, previously used in NLP problems, to log file systems. The model is evaluated on Hadoop Distributed File System (HDFS) logs, and an interpretation is provided for every timestep and every feature. Our main concern in this paper is the interpretability of the results rather than the accuracy of the model. The approach not only offers an interpretation of the outcomes but also helps build trust in the model by making sure that spurious correlations are avoided, making it suitable for real-life applications.
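To make the relevance propagation idea concrete, the following is a minimal sketch of the generic epsilon-stabilized LRP rule applied to a single linear layer. It is not the paper's implementation, which the abstract does not show: the function name `lrp_linear`, the toy weights, and the stabilizer value `eps` are all assumptions chosen for illustration, and the special handling that LRP needs for LSTM gate multiplications is left out.

```python
def lrp_linear(x, w, b, relevance_out, eps=1e-6):
    """Redistribute relevance from a linear layer's outputs to its inputs.

    x: input activations (length n_in)
    w: weights as a list of rows; w[i][j] connects input i to output j
    b: biases (length n_out)
    relevance_out: relevance assigned to each output (length n_out)
    """
    n_in, n_out = len(x), len(b)
    # Forward pass: z_j = sum_i x_i * w_ij + b_j
    z = [sum(x[i] * w[i][j] for i in range(n_in)) + b[j] for j in range(n_out)]
    # Stabilizer pushes the denominator away from zero while keeping its sign
    denom = [zj + (eps if zj >= 0 else -eps) for zj in z]
    # Each input receives a share proportional to its contribution x_i * w_ij
    return [
        sum(x[i] * w[i][j] / denom[j] * relevance_out[j] for j in range(n_out))
        for i in range(n_in)
    ]


# Hypothetical toy layer: 3 inputs (e.g. features at one timestep), 2 outputs
x = [1.0, 2.0, 0.0]
w = [[0.5, -1.0], [0.25, 1.0], [2.0, 2.0]]
b = [0.0, 0.0]
relevance_out = [1.0, 0.0]  # all relevance starts on output 0

relevance_in = lrp_linear(x, w, b, relevance_out)
print(relevance_in)
```

With zero biases and a small `eps`, the input relevances sum to approximately the output relevance, and an input whose activation is zero receives none. Applying such a rule layer by layer, back through the timesteps, is one way to obtain a score for every timestep and every feature.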