Susnata Bhattacharya;Biplob Ray;Ritesh Chugh;Steven Gordon
An Online Parsing Framework for Semistructured Streaming System Logs of Internet of Things Systems
IEEE Open Journal of Instrumentation and Measurement, vol. 2, pp. 1-18. Published 2022-12-30.
DOI: 10.1109/OJIM.2022.3232650
https://ieeexplore.ieee.org/document/10004508/
Citation count: 0
Abstract
This article presents a novel log abstraction framework based on neural open information extraction (OpenIE) and dynamic word embedding principles. Although various log parsing frameworks have been proposed in the literature, they are modeled on predefined heuristics or auto-regressive methodologies that work well in offline scenarios. These frameworks are less suitable for dynamic self-adaptive systems, such as the Internet of Things (IoT), where log outputs show diverse contextual variations and irregular timing. It is therefore essential to move beyond these traditional approaches and develop a systematic model that can analyze log outputs in real time and increase the uptime of IoT networks so that they are almost always available. To address these needs, the proposed framework uses OpenIE together with term frequency/inverse document frequency (TF/IDF) vectorization to construct a set of relational triples (triple-sets). In addition, a dynamic pretrained encoder–decoder architecture is used to incorporate positional and contextualized information into its outputs. This methodology enables the framework to extract richer word representations with dynamic contextualization of time-sensitive event logs, supporting downstream activities such as failure prediction and prognostic analysis of IoT networks. The framework is evaluated on system event log traces collected from a long-range wide-area network (LoRaWAN) IoT gateway to proactively determine the probable causes of its various failure scenarios. The study also provides a comparative analysis of its mathematical representations against current state-of-the-art (SOTA) approaches to highlight the advantages of the proposed model, particularly from a data analytics standpoint.
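As a rough illustration of the TF/IDF vectorization step mentioned in the abstract, the sketch below weights the terms of a few log lines by tf × log(N/df). It is not the authors' implementation: the log lines, the whitespace tokenizer, and the names `tokenize` and `tf_idf` are hypothetical, and the OpenIE triple extraction and encoder-decoder stages are not modeled.

```python
import math
from collections import Counter


def tokenize(line):
    # Assumption: lowercase + whitespace split. A real log parser would
    # also mask variable fields such as timestamps and device IDs.
    return line.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of log lines in which each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical log lines, invented for illustration only.
logs = [
    "gateway connection lost",
    "gateway connection restored",
    "gateway packet forwarder crashed",
]
weights = tf_idf(logs)

for line, w in zip(logs, weights):
    print(line, "->", w)
```

In this toy corpus, a term that occurs in every line (here `gateway`) receives zero weight, while terms unique to one line receive the highest weight. That is the property that lets TF/IDF separate constant template tokens from event-specific ones.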