Toward data efficient anomaly detection in heterogeneous edge–cloud environments using clustered federated learning

Future Generation Computer Systems-The International Journal of Escience, Volume 164, Article 107559. Pub Date: 2024-10-19. DOI: 10.1016/j.future.2024.107559
Impact factor: 6.2. JCR: Q1 (COMPUTER SCIENCE, THEORY & METHODS). Region 2 (Computer Science).
Full text: https://www.sciencedirect.com/science/article/pii/S0167739X24005235

Zongpu Wei, Jinsong Wang, Zening Zhao, Kai Shi

Citations: 0

Abstract

Anomaly detection in edge–cloud scenarios is a critical means of securing the network environment. Federated learning (FL)-based anomaly detection combines multiple data sources while preserving data privacy, which makes it a promising distributed detection method. However, FL-based anomaly detection systems are usually affected by data heterogeneity and data bias, so the data used for FL is exploited inefficiently and detection performance declines. We propose an iterative federated clustering ensemble algorithm named IFCEA, in which we (1) establish a committee on the devices and select the optimal participation for each device based on the committee's evaluations; (2) filter the clusters based on the committee results and exclude biased clusters; (3) design an aggregation weight that reflects the degree of local distribution balance; and (4) present a novel cluster initialization method, OneBiPartition, which adapts to the number of clusters and starts the clustered federated task efficiently. IFCEA improves the quality of the data used in FL-based anomaly detection systems from two perspectives, device selection and participation weights, and thereby addresses the data heterogeneity and data bias encountered during the FL training phase. Extensive experimental results on five network traffic datasets (UNSW-NB15, CIC-IDS2017, CIC-IDS2018, CIC-DDoS2019 and BCCC-DDoS2024) demonstrate that the proposed framework achieves superior detection metrics and convergence performance.
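The abstract does not give the formula for the aggregation weight in item (3), so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that "local distribution balance" can be approximated by the normalized Shannon entropy of a device's label distribution, and that models are flat lists of parameters. The names `balance_score` and `aggregate` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def balance_score(labels, num_classes=2):
    """Normalized Shannon entropy of a device's labels, in [0, 1].

    Assumption (not from the paper): 1.0 means the classes are perfectly
    balanced, 0.0 means the device holds a single class or no data.
    """
    n = len(labels)
    if n == 0 or num_classes < 2:
        return 0.0
    counts = Counter(labels)
    # Sum of p * log(1 / p) over the observed classes.
    entropy = sum((c / n) * math.log(n / c) for c in counts.values())
    return entropy / math.log(num_classes)


def aggregate(models, label_sets, num_classes=2):
    """Average flat parameter lists, weighting each device by its balance score.

    Falls back to a uniform average when every device scores 0.
    """
    scores = [balance_score(ls, num_classes) for ls in label_sets]
    total = sum(scores)
    if total == 0:
        weights = [1.0 / len(models)] * len(models)
    else:
        weights = [s / total for s in scores]
    dim = len(models[0])
    return [sum(w * m[i] for w, m in zip(weights, models)) for i in range(dim)]


# Hypothetical usage: device 0 holds balanced normal/attack labels,
# device 1 holds only normal traffic, so device 0 dominates the average.
global_model = aggregate([[1.0, 2.0], [3.0, 4.0]], [[0, 1, 0, 1], [0, 0, 0, 0]])
```

In this sketch a device whose local data contains only one class contributes nothing to the aggregate, which is one simple way to down-weight biased participants. A real system would likely combine such a score with sample counts and with the committee-based cluster filtering described in items (1) and (2).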

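Source journal

Future Generation Computer Systems-The International Journal of Escience (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore: 19.90

Self-citation rate: 2.70%

Articles published: 376

Review time: 10.6 months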

About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.