Efficiently localizing system anomalies for cloud infrastructures: a novel Dynamic Graph Transformer based Parallel Framework

Hongxia He, Xi Li, Peng Chen, Juan Chen, Ming Liu, Lei Wu
{"title":"Efficiently localizing system anomalies for cloud infrastructures: a novel Dynamic Graph Transformer based Parallel Framework","authors":"Hongxia He, Xi Li, Peng Chen, Juan Chen, Ming Liu, Lei Wu","doi":"10.1186/s13677-024-00677-x","DOIUrl":null,"url":null,"abstract":"Cloud environment is a virtual, online, and distributed computing environment that provides users with large-scale services. And cloud monitoring plays an integral role in protecting infrastructures in the cloud environment. Cloud monitoring systems need to closely monitor various KPIs of cloud resources, to accurately detect anomalies. However, due to the complexity and highly dynamic nature of the cloud environment, anomaly detection for these KPIs with various patterns and data quality is a huge challenge, especially those massive unlabeled data. Besides, it’s also difficult to improve the accuracy of the existing anomaly detection methods. To solve these problems, we propose a novel Dynamic Graph Transformer based Parallel Framework (DGT-PF) for efficiently detect system anomalies in cloud infrastructures, which utilizes Transformer with anomaly attention mechanism and Graph Neural Network (GNN) to learn the spatio-temporal features of KPIs to improve the accuracy and timeliness of model anomaly detection. Specifically, we propose an effective dynamic relationship embedding strategy to dynamically learn spatio-temporal features and adaptively generate adjacency matrices, and soft cluster each GNN layer through Diffpooling module. In addition, we also use nonlinear neural network model and AR-MLP model in parallel to obtain better detection accuracy and improve detection performance. The experiment shows that the DGT-PF framework have achieved the highest F1-Score on 5 public datasets, with an average improvement of 21.6% compared to 11 anomaly detection models.","PeriodicalId":501257,"journal":{"name":"Journal of Cloud Computing","volume":"70 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Cloud Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1186/s13677-024-00677-x","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Cloud environment is a virtual, online, and distributed computing environment that provides users with large-scale services. And cloud monitoring plays an integral role in protecting infrastructures in the cloud environment. Cloud monitoring systems need to closely monitor various KPIs of cloud resources, to accurately detect anomalies. However, due to the complexity and highly dynamic nature of the cloud environment, anomaly detection for these KPIs with various patterns and data quality is a huge challenge, especially those massive unlabeled data. Besides, it’s also difficult to improve the accuracy of the existing anomaly detection methods. To solve these problems, we propose a novel Dynamic Graph Transformer based Parallel Framework (DGT-PF) for efficiently detect system anomalies in cloud infrastructures, which utilizes Transformer with anomaly attention mechanism and Graph Neural Network (GNN) to learn the spatio-temporal features of KPIs to improve the accuracy and timeliness of model anomaly detection. Specifically, we propose an effective dynamic relationship embedding strategy to dynamically learn spatio-temporal features and adaptively generate adjacency matrices, and soft cluster each GNN layer through Diffpooling module. In addition, we also use nonlinear neural network model and AR-MLP model in parallel to obtain better detection accuracy and improve detection performance. The experiment shows that the DGT-PF framework have achieved the highest F1-Score on 5 public datasets, with an average improvement of 21.6% compared to 11 anomaly detection models.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
高效定位云基础设施的系统异常:基于动态图变换器的新型并行框架
云环境是一种虚拟、在线和分布式计算环境,可为用户提供大规模服务。云监控在保护云环境中的基础设施方面发挥着不可或缺的作用。云监控系统需要密切监控云资源的各种关键绩效指标,以准确检测异常情况。然而,由于云环境的复杂性和高度动态性,对这些具有不同模式和数据质量的 KPI 进行异常检测是一项巨大的挑战,尤其是那些海量的无标记数据。此外,提高现有异常检测方法的准确性也很困难。为了解决这些问题,我们提出了一种新颖的基于动态图变换器的并行框架(DGT-PF),用于有效检测云基础设施中的系统异常,该框架利用带有异常关注机制的变换器和图神经网络(GNN)来学习 KPI 的时空特征,从而提高模型异常检测的准确性和及时性。具体来说,我们提出了一种有效的动态关系嵌入策略,以动态学习时空特征并自适应生成邻接矩阵,并通过 Diffpooling 模块对每个 GNN 层进行软聚类。此外,我们还并行使用了非线性神经网络模型和 AR-MLP 模型,以获得更好的检测精度,提高检测性能。实验结果表明,DGT-PF 框架在 5 个公共数据集上取得了最高的 F1 分数,与 11 个异常检测模型相比,平均提高了 21.6%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
A cost-efficient content distribution optimization model for fog-based content delivery networks Toward security quantification of serverless computing SMedIR: secure medical image retrieval framework with ConvNeXt-based indexing and searchable encryption in the cloud A trusted IoT data sharing method based on secure multi-party computation Wind power prediction method based on cloud computing and data privacy protection
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1