对基于日志的故障预测深度学习模型进行系统评估

IF 3.5 2区 计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Empirical Software Engineering Pub Date : 2024-06-20 DOI:10.1007/s10664-024-10501-4
Fatemeh Hadadi, Joshua H. Dawes, Donghwan Shin, Domenico Bianculli, Lionel Briand
{"title":"对基于日志的故障预测深度学习模型进行系统评估","authors":"Fatemeh Hadadi, Joshua H. Dawes, Donghwan Shin, Domenico Bianculli, Lionel Briand","doi":"10.1007/s10664-024-10501-4","DOIUrl":null,"url":null,"abstract":"<p>With the increasing complexity and scope of software systems, their dependability is crucial. The analysis of log data recorded during system execution can enable engineers to automatically predict failures at run time. Several Machine Learning (ML) techniques, including traditional ML and Deep Learning (DL), have been proposed to automate such tasks. However, current empirical studies are limited in terms of covering all main DL types—Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), and transformer—as well as examining them on a wide range of diverse datasets. In this paper, we aim to address these issues by systematically investigating the combination of log data embedding strategies and DL types for failure prediction. To that end, we propose a modular architecture to accommodate various configurations of embedding strategies and DL-based encoders. To further investigate how dataset characteristics such as dataset size and failure percentage affect model accuracy, we synthesised 360 datasets, with varying characteristics, for three distinct system behavioural models, based on a systematic and automated generation approach. Using the F1 score metric, our results show that the best overall performing configuration is a CNN-based encoder with Logkey2vec. Additionally, we provide specific dataset conditions, namely a dataset size <span>\\(&gt;350\\)</span> or a failure percentage <span>\\(&gt;7.5\\%\\)</span>, under which this configuration demonstrates high accuracy for failure prediction.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":null,"pages":null},"PeriodicalIF":3.5000,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Systematic Evaluation of Deep Learning Models for Log-based Failure Prediction\",\"authors\":\"Fatemeh Hadadi, Joshua H. Dawes, Donghwan Shin, Domenico Bianculli, Lionel Briand\",\"doi\":\"10.1007/s10664-024-10501-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>With the increasing complexity and scope of software systems, their dependability is crucial. The analysis of log data recorded during system execution can enable engineers to automatically predict failures at run time. Several Machine Learning (ML) techniques, including traditional ML and Deep Learning (DL), have been proposed to automate such tasks. However, current empirical studies are limited in terms of covering all main DL types—Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), and transformer—as well as examining them on a wide range of diverse datasets. In this paper, we aim to address these issues by systematically investigating the combination of log data embedding strategies and DL types for failure prediction. To that end, we propose a modular architecture to accommodate various configurations of embedding strategies and DL-based encoders. To further investigate how dataset characteristics such as dataset size and failure percentage affect model accuracy, we synthesised 360 datasets, with varying characteristics, for three distinct system behavioural models, based on a systematic and automated generation approach. Using the F1 score metric, our results show that the best overall performing configuration is a CNN-based encoder with Logkey2vec. Additionally, we provide specific dataset conditions, namely a dataset size <span>\\\\(&gt;350\\\\)</span> or a failure percentage <span>\\\\(&gt;7.5\\\\%\\\\)</span>, under which this configuration demonstrates high accuracy for failure prediction.</p>\",\"PeriodicalId\":11525,\"journal\":{\"name\":\"Empirical Software Engineering\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.5000,\"publicationDate\":\"2024-06-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Empirical Software Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s10664-024-10501-4\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Empirical Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10664-024-10501-4","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0

摘要

随着软件系统的复杂性和范围不断增加,其可靠性至关重要。通过分析系统执行过程中记录的日志数据,工程师可以在运行时自动预测故障。目前已经提出了几种机器学习(ML)技术,包括传统的 ML 和深度学习(DL),用于自动完成此类任务。然而,目前的实证研究在涵盖所有主要的深度学习类型--递归神经网络(RNN)、卷积神经网络(CNN)和变压器--以及在各种不同的数据集上对它们进行检验方面都很有限。在本文中,我们旨在通过系统地研究日志数据嵌入策略和故障预测 DL 类型的组合来解决这些问题。为此,我们提出了一种模块化架构,以适应嵌入策略和基于 DL 的编码器的各种配置。为了进一步研究数据集的特征(如数据集大小和故障百分比)对模型准确性的影响,我们基于系统化的自动生成方法,为三种不同的系统行为模型合成了 360 个特征各异的数据集。使用 F1 分数指标,我们的结果表明,整体性能最佳的配置是基于 CNN 的编码器和 Logkey2vec。此外,我们还提供了特定的数据集条件,即数据集大小(350)或故障百分比(7.5%),在这些条件下,该配置的故障预测准确率很高。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

摘要图片

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Systematic Evaluation of Deep Learning Models for Log-based Failure Prediction

With the increasing complexity and scope of software systems, their dependability is crucial. The analysis of log data recorded during system execution can enable engineers to automatically predict failures at run time. Several Machine Learning (ML) techniques, including traditional ML and Deep Learning (DL), have been proposed to automate such tasks. However, current empirical studies are limited in terms of covering all main DL types—Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), and transformer—as well as examining them on a wide range of diverse datasets. In this paper, we aim to address these issues by systematically investigating the combination of log data embedding strategies and DL types for failure prediction. To that end, we propose a modular architecture to accommodate various configurations of embedding strategies and DL-based encoders. To further investigate how dataset characteristics such as dataset size and failure percentage affect model accuracy, we synthesised 360 datasets, with varying characteristics, for three distinct system behavioural models, based on a systematic and automated generation approach. Using the F1 score metric, our results show that the best overall performing configuration is a CNN-based encoder with Logkey2vec. Additionally, we provide specific dataset conditions, namely a dataset size \(>350\) or a failure percentage \(>7.5\%\), under which this configuration demonstrates high accuracy for failure prediction.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Empirical Software Engineering
Empirical Software Engineering 工程技术-计算机:软件工程
CiteScore
8.50
自引率
12.20%
发文量
169
审稿时长
>12 weeks
期刊介绍: Empirical Software Engineering provides a forum for applied software engineering research with a strong empirical component, and a venue for publishing empirical results relevant to both researchers and practitioners. Empirical studies presented here usually involve the collection and analysis of data and experience that can be used to characterize, evaluate and reveal relationships between software development deliverables, practices, and technologies. Over time, it is expected that such empirical results will form a body of knowledge leading to widely accepted and well-formed theories. The journal also offers industrial experience reports detailing the application of software technologies - processes, methods, or tools - and their effectiveness in industrial settings. Empirical Software Engineering promotes the publication of industry-relevant research, to address the significant gap between research and practice.
期刊最新文献
An empirical study on developers’ shared conversations with ChatGPT in GitHub pull requests and issues Quality issues in machine learning software systems An empirical study of token-based micro commits Software product line testing: a systematic literature review Consensus task interaction trace recommender to guide developers’ software navigation
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1