Title: Troubleshooting distributed data analytics systems
Author: Aidi Pi
Venue: Proceedings of the 20th International Middleware Conference Doctoral Symposium
Published: 2019-12-09
DOI: 10.1145/3366624.3368157 (https://doi.org/10.1145/3366624.3368157)
Citation count: 0
Abstract
Data analytics applications run on large-scale distributed systems, so troubleshooting both the applications and the underlying systems is critical to sustaining high performance. This thesis focuses on efficient log analysis for troubleshooting distributed data analytics systems and makes two contributions. 1) We designed a tool that collects logs and resource metrics from distributed data analytics systems to facilitate troubleshooting. 2) We designed a log analysis tool that extracts semantic meaning from logs and automatically reports potential anomalies by leveraging natural language processing approaches.
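
To make the idea of automatically reporting potential anomalies from logs more concrete, here is a minimal Python sketch. It is not the tool described in the abstract, which relies on natural language processing. As a stand-in, it uses regular-expression masking plus frequency counting: variable fields are masked so that similar log lines collapse into one template, and templates that occur rarely are flagged. The masking rules, the function names `to_template` and `rare_templates`, and the `max_share` threshold are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical masking rules (not from the thesis): replace variable fields so
# that lines emitted by the same logging statement collapse to one template.
# Order matters: IP addresses must be masked before bare numbers.
_MASKS = [
    (re.compile(r"\b\d+\.\d+\.\d+\.\d+(:\d+)?\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def to_template(message: str) -> str:
    """Mask IPs, hex values, and integers in a single log message."""
    for pattern, token in _MASKS:
        message = pattern.sub(token, message)
    return message


def rare_templates(messages, max_share=0.05):
    """Return, sorted, the templates whose share of all messages is <= max_share.

    Rarity is only a crude proxy for "potentially anomalous"; the threshold
    is an illustrative assumption, not a recommended value.
    """
    templates = [to_template(m) for m in messages]
    counts = Counter(templates)
    total = len(templates)
    return sorted(t for t, c in counts.items() if c / total <= max_share)
```

For example, given thirty routine "Task N finished in M ms" messages and a single "Lost connection to 10.0.0.7:7337" message, `rare_templates` would flag only the template of the connection message. A real system would also need to correlate such findings with resource metrics, which this sketch does not attempt.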