Graph-based active semi-supervised learning: Case study in water quality monitoring
Zesen Wang, Yonggang Li, Chunhua Yang, Hongqiu Zhu, Can Zhou
Advanced Engineering Informatics, Volume 62, Article 102902. Published 2024-10-01.
DOI: 10.1016/j.aei.2024.102902
https://www.sciencedirect.com/science/article/pii/S1474034624005536
Citations: 0
Abstract
Process monitoring is a key technology in industrial production and manufacturing, and machine learning algorithms play a crucial role in it. However, data collection in industrial settings is costly, which seriously limits how far monitoring models can be improved. To address this issue, a graph-based active semi-supervised learning (GASSL) strategy is proposed that derives reliable monitoring models at limited labeling cost. First, a robust unsupervised active learning (RUAL) method is proposed. It incorporates data reconstruction, low-rank representation, and manifold learning into a unified framework to select the most representative samples for labeling, which avoids the poor performance of model-based active learning algorithms when the initial sample size is limited. Second, to make full use of the samples that remain unlabeled after labeling, pseudo-labels are assigned to them through label propagation, further expanding the sample set. At the same time, active learning selects the most valuable samples as the labeled node set of the graph model, which strengthens the performance of label propagation. Experimental results on three water quality monitoring datasets (a public dataset, a simulated dataset, and a real total nitrogen detection dataset) demonstrate the effectiveness of the proposed method.
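The select-then-propagate idea in the abstract can be illustrated with a minimal sketch. This is not the paper's RUAL or GASSL implementation: farthest-first (k-center greedy) selection stands in for the representative-sample step, and a plain iterative label propagation over a Gaussian-affinity graph stands in for the graph model. The function names, the toy two-cluster data, and the parameters `budget`, `sigma`, and `iters` are all assumptions made for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def select_representatives(points, budget):
    """Pick `budget` indices by farthest-first selection.

    Stand-in for the paper's RUAL step (assumption, not the authors' method).
    """
    n, dim = len(points), len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    # Start from the sample closest to the data mean.
    first = min(range(n), key=lambda i: sq_dist(points[i], mean))
    chosen = [first]
    # min_d[i] = squared distance from sample i to its nearest chosen sample.
    min_d = [sq_dist(p, points[first]) for p in points]
    while len(chosen) < budget:
        nxt = max(range(n), key=lambda i: min_d[i])
        chosen.append(nxt)
        min_d = [min(min_d[i], sq_dist(points[i], points[nxt])) for i in range(n)]
    return chosen

def propagate_labels(points, labeled, n_classes, sigma=1.0, iters=100):
    """Iterative label propagation with clamped labeled nodes.

    `labeled` maps sample index -> class id. Returns a class id per sample.
    """
    n = len(points)
    # Gaussian affinity matrix without self-loops.
    W = [[0.0 if i == j else
          math.exp(-sq_dist(points[i], points[j]) / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    # Row-normalise affinities into transition probabilities.
    P = [[w / sum(row) for w in row] for row in W]
    F = [[0.0] * n_classes for _ in range(n)]
    for i, c in labeled.items():
        F[i][c] = 1.0
    for _ in range(iters):
        F_new = [[sum(P[i][j] * F[j][c] for j in range(n))
                  for c in range(n_classes)] for i in range(n)]
        # Clamp the queried samples to their known labels.
        for i, c in labeled.items():
            F_new[i] = [1.0 if k == c else 0.0 for k in range(n_classes)]
        F = F_new
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]

# Toy data (hypothetical): two well-separated clusters, one class each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.05),
          (3.0, 3.0), (3.1, 2.9), (2.9, 3.2), (3.05, 3.1)]
true_labels = [0, 0, 0, 0, 1, 1, 1, 1]

chosen = select_representatives(points, budget=2)      # samples sent for labeling
labeled = {i: true_labels[i] for i in chosen}          # simulated labeling oracle
predicted = propagate_labels(points, labeled, n_classes=2)
print(chosen, predicted)
```

With a labeling budget of two, the selection step lands one query in each cluster, and propagation then spreads those two labels to the remaining six samples as pseudo-labels. On real monitoring data the graph construction and selection criterion would matter far more than in this toy case.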
About the journal:
Advanced Engineering Informatics is an international journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitative and quantitative. Abstracting and indexing services for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus, and INSPEC.