Large language model-based optical network log analysis using LLaMA2 with instruction tuning
Yue Pang; Min Zhang; Yanli Liu; Xiangbin Li; Yidi Wang; Yahang Huan; Zhuo Liu; Jin Li; Danshi Wang
Journal of Optical Communications and Networking, vol. 16, no. 11, pp. 1116-1132. Published 2024-10-24. DOI: 10.1364/JOCN.527874
URL: https://ieeexplore.ieee.org/document/10734084/
Journal impact factor: 4.0 (JCR Q1, Computer Science, Hardware & Architecture)
Citation count: 0
Abstract
An optical network encompasses numerous devices and links, which generate a large volume of logs. Analyzing these logs is essential for network optimization, failure diagnosis, and health monitoring. However, the large scale and diverse formats of optical network logs present several challenges: manual processing is costly and difficult, existing analysis methods have insufficient semantic understanding, and data security and privacy requirements are strict. Generative artificial intelligence (GAI), with its powerful language understanding and generation capabilities, has the potential to address these challenges. Large language models (LLMs), as a concrete realization of GAI, are well-suited for analyzing DCI logs, replacing human experts and enhancing accuracy. Additionally, LLMs enable intelligent interactions with network administrators, automating tasks and improving operational efficiency. Moreover, fine-tuning open-source LLMs protects data privacy and enhances log analysis accuracy. Therefore, we introduce LLMs and propose a log analysis method based on instruction tuning of LLaMA2 for log parsing, anomaly detection and classification, anomaly analysis, and report generation. Real log data extracted from a field-deployed network was used to design and construct the instruction tuning datasets. We used these datasets for instruction tuning and then demonstrated and evaluated the effectiveness of the proposed scheme. The results indicate that the scheme improves the performance of log analysis tasks, with a 14% improvement in exact match rate for log parsing, a 13% improvement in F1-score for anomaly detection and classification, and a 23% improvement in usability for anomaly analysis, compared with the best baselines.
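To make the evaluation terms in the abstract concrete, the following Python sketch shows one plausible shape for an instruction tuning record and how an exact match rate and an F1-score could be computed. It is not the authors' code: the `instruction`/`input`/`output` field names, the sample log line and template, the whitespace normalization, and the choice of macro-averaged F1 are all assumptions made for illustration.

```python
# Hypothetical instruction-tuning record; the field names and the log text
# are invented for illustration and do not come from the paper's dataset.
record = {
    "instruction": "Extract the log template, replacing variable fields with <*>.",
    "input": "Port 3 on shelf 1 reported loss of signal",
    "output": "Port <*> on shelf <*> reported loss of signal",
}


def exact_match_rate(predictions, references):
    """Fraction of predictions identical to their reference.

    Whitespace is collapsed before comparison (an assumed normalization).
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0

    def normalize(text):
        return " ".join(text.split())

    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


def macro_f1(predicted_labels, true_labels):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("label lists must have equal length")
    pairs = list(zip(predicted_labels, true_labels))
    scores = []
    for label in sorted(set(predicted_labels) | set(true_labels)):
        tp = sum(p == label and t == label for p, t in pairs)
        fp = sum(p == label and t != label for p, t in pairs)
        fn = sum(p != label and t == label for p, t in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0
```

In this sketch, a model's generated template for `record["input"]` would be compared against `record["output"]` by `exact_match_rate`, while `macro_f1` would score predicted anomaly classes against ground-truth labels.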
About the journal:
The scope of the Journal includes advances in the state-of-the-art of optical networking science, technology, and engineering. Both theoretical contributions (including new techniques, concepts, analyses, and economic studies) and practical contributions (including optical networking experiments, prototypes, and new applications) are encouraged. Subareas of interest include the architecture and design of optical networks, optical network survivability and security, software-defined optical networking, elastic optical networks, data and control plane advances, network management related innovation, and optical access networks. Enabling technologies and their applications are suitable topics only if the results are shown to directly impact optical networking beyond simple point-to-point networks.