Active learning of optical path classification
Paweł Cichosz
Engineering Applications of Artificial Intelligence, vol. 151, Article 110582 (published 2025-07-01; Epub 2025-03-27)
DOI: 10.1016/j.engappai.2025.110582
https://www.sciencedirect.com/science/article/pii/S0952197625005822
Impact factor: 8.0; JCR Q1 (Automation & Control Systems)
Abstract
Creating classification models to predict whether an optical channel can provide a required level of transmission quality is a promising approach to automated path quality assessment in optical network design. The applicability of machine learning algorithms in this domain is limited, however, by the cost, effort, and time needed to collect sufficient labeled data for model creation. Active learning may substantially reduce the amount of labeled data required. It is an iterative process that creates a sequence of models: each model is used to select the most useful paths for a class labeling query, and the newly labeled paths are then added to the training set for the next model. Such a learning scenario poses different challenges for machine learning algorithms than standard “passive” learning, since they have to deal with very small and often imbalanced training sets. This work examines how these challenges are handled by algorithms that prior studies have found particularly useful for optical path classification. The random forest and extreme gradient boosting algorithms are applied to active learning from a real dataset provided by a network operator, starting from only a handful of training instances. Uncertainty sampling and diversity sampling are used for query selection. Confidence-based and stability-based stopping criteria are used to determine when the process can be safely terminated. The results confirm that active learning is a useful approach to creating optical path classification models, making it possible to save more than half of the cost of providing class labels for model creation.
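The iterative process described in the abstract can be illustrated with a minimal pool-based sketch. It is not the paper's implementation: the abstract does not give the features, the dataset, or the exact stopping rules, so everything below is an assumption made for illustration. A small bagged ensemble of decision stumps stands in for the random forest, the paths are synthetic `(length_km, hop_count)` pairs, the `oracle` labeling rule is hypothetical, and the stopping check is a simplified confidence-based criterion (mean vote confidence over the unlabeled pool).

```python
import random

def make_pool(n, rng):
    """Synthetic candidate paths as (length_km, hop_count) pairs (assumed features)."""
    return [(rng.uniform(0, 2000), rng.randint(1, 10)) for _ in range(n)]

def oracle(path):
    """Hypothetical labeling query: 1 if the path meets the quality target."""
    length_km, hops = path
    return 1 if length_km + 100 * hops < 1500 else 0

def fit_stump(sample):
    """Pick the single-feature threshold rule with the fewest training errors."""
    best = None
    for f in range(2):
        for x, _ in sample:
            t = x[f]
            for below in (0, 1):
                errors = sum((below if xi[f] <= t else 1 - below) != yi
                             for xi, yi in sample)
                if best is None or errors < best[0]:
                    best = (errors, f, t, below)
    _, f, t, below = best
    return lambda p: below if p[f] <= t else 1 - below

def fit_ensemble(labeled, rng, n_stumps=25):
    """Bagging: each stump is trained on a bootstrap resample of the labeled set."""
    return [fit_stump([rng.choice(labeled) for _ in labeled])
            for _ in range(n_stumps)]

def prob_feasible(ensemble, path):
    """Fraction of stumps voting for class 1."""
    return sum(stump(path) for stump in ensemble) / len(ensemble)

def active_learn(pool, rng, n_init=4, budget=40, stop_confidence=0.95):
    """Grow the labeled set one query at a time, retraining after each label."""
    unlabeled = list(pool)
    rng.shuffle(unlabeled)
    labeled = [(p, oracle(p)) for p in unlabeled[:n_init]]
    unlabeled = unlabeled[n_init:]
    ensemble = fit_ensemble(labeled, rng)
    while len(labeled) < budget and unlabeled:
        probs = [prob_feasible(ensemble, p) for p in unlabeled]
        # Simplified confidence-based stopping: mean confidence over the pool,
        # checked only after a few queries so a tiny initial set cannot stop early.
        mean_conf = sum(max(q, 1 - q) for q in probs) / len(probs)
        if len(labeled) >= 2 * n_init and mean_conf >= stop_confidence:
            break
        # Uncertainty sampling: query the path whose vote share is closest to 0.5.
        i = min(range(len(unlabeled)), key=lambda k: abs(probs[k] - 0.5))
        path = unlabeled.pop(i)
        labeled.append((path, oracle(path)))
        ensemble = fit_ensemble(labeled, rng)
    return ensemble, labeled

rng = random.Random(0)
pool = make_pool(300, rng)
ensemble, labeled = active_learn(pool, rng)
test_paths = make_pool(200, rng)
accuracy = sum((prob_feasible(ensemble, p) >= 0.5) == bool(oracle(p))
               for p in test_paths) / len(test_paths)
print(f"labels used: {len(labeled)} of {len(pool)}, test accuracy: {accuracy:.2f}")
```

The guard `len(labeled) >= 2 * n_init` reflects the difficulty the abstract points out: with only a handful of possibly single-class instances, an early model can be confidently wrong, so a confidence check alone could stop the process too soon. Diversity sampling and stability-based stopping, also named in the abstract, are not covered by this sketch.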
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.