SHAP-assisted EE-LightGBM model for explainable fault diagnosis in practical optical networks
Chunyu Zhang; Yu Chen; Min Zhang; Zhuo Liu; Danshi Wang
Journal of Optical Communications and Networking, vol. 17, no. 2, pp. 81-94, published 2025-01-17
DOI: 10.1364/JOCN.527872
https://ieeexplore.ieee.org/document/10844950/
Abstract
Reliable fault diagnosis is crucial for the stable operation of optical networks. Data-driven techniques have recently shown significant advantages in fault diagnosis thanks to their data-processing and adaptive learning capabilities. However, because equipment faults in practical optical networks are rare events, the collected data are often severely imbalanced, which greatly limits the accuracy of traditional data-driven models. To address this challenge, a SHAP-assisted EE-LightGBM scheme is proposed for explainable fault diagnosis in practical optical networks. The EE-LightGBM model combines undersampling at the data level with a hybrid ensemble strategy at the model level, making full use of the scarce fault samples and alleviating the impact of data imbalance on model training. The SHAP method is then used to explain the EE-LightGBM model: it quantifies the contribution of each input feature to the model’s decision output, which improves the model’s explainability and supports a deeper understanding of the mechanisms underlying equipment faults. SHAP analysis identifies the key features highly correlated with equipment faults, from which the causes of those faults can be inferred. Evaluation on data from operator-managed backbone network equipment shows that, at a data imbalance rate of 5.61%, the EE-LightGBM model achieves an accuracy of 0.9968 and an F1 score of 0.9711, with false negative and false positive rates of 0.0033 and 0.0032, respectively. Moreover, the cause identification results are consistent with diagnostic expertise. We also explore the impact of the data imbalance rate on the detection performance of the EE-LightGBM model. The model’s low false negative rate under data imbalance further demonstrates its effectiveness for fault diagnosis in practical optical networks.
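To make the data-level undersampling and the reported metrics concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "EE" denotes an EasyEnsemble-style scheme (the abstract does not expand the acronym), that faults are labelled 1 and normal samples 0, and that one base learner (LightGBM in the paper, omitted here) would be trained on each balanced subset. The function names `balanced_subsets` and `detection_metrics` are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def balanced_subsets(labels: Sequence[int], n_subsets: int, seed: int = 0) -> List[List[int]]:
    """Build index lists that pair all fault samples with an equal-sized
    random draw of normal samples (one list per base learner)."""
    rng = random.Random(seed)
    fault_idx = [i for i, y in enumerate(labels) if y == 1]
    normal_idx = [i for i, y in enumerate(labels) if y == 0]
    if not fault_idx or len(normal_idx) < len(fault_idx):
        raise ValueError("expected label 1 to be a non-empty minority class")
    subsets = []
    for _ in range(n_subsets):
        # Undersample the majority class without replacement; every subset
        # keeps all of the scarce fault samples.
        drawn = rng.sample(normal_idx, len(fault_idx))
        subsets.append(sorted(fault_idx + drawn))
    return subsets


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, F1, false negative rate and false positive rate,
    treating label 1 (fault) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
        "fnr": fn / (fn + tp) if (fn + tp) else 0.0,  # missed faults
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,  # false alarms
    }
```

For example, `balanced_subsets([1] * 6 + [0] * 94, n_subsets=4)` yields four index lists of 12 samples each, every one containing all six fault samples. How the paper combines the per-subset learners (its "hybrid ensemble strategy") is not specified in the abstract, so that step is left out of the sketch.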
About the journal:
The scope of the Journal includes advances in the state-of-the-art of optical networking science, technology, and engineering. Both theoretical contributions (including new techniques, concepts, analyses, and economic studies) and practical contributions (including optical networking experiments, prototypes, and new applications) are encouraged. Subareas of interest include the architecture and design of optical networks, optical network survivability and security, software-defined optical networking, elastic optical networks, data and control plane advances, network management related innovation, and optical access networks. Enabling technologies and their applications are suitable topics only if the results are shown to directly impact optical networking beyond simple point-to-point networks.