Semi-supervised concept drift detection and adaptation based on conformal martingale framework

Journal of Process Control, vol. 147, Article 103374 · Pub Date: 2025-01-22 · DOI: 10.1016/j.jprocont.2025.103374
IF 3.3 · Zone 2 (Computer Science) · JCR Q2 (Automation & Control Systems)
Yu Zhang, Ping Zhou, Ruiyao Zhang, Shaowen Lu, Tianyou Chai
Full text: https://www.sciencedirect.com/science/article/pii/S0959152425000022
Citations: 0

Abstract

In the realm of industrial applications for machine learning, multiple challenges are frequently encountered, such as concept drift (CD) and the prohibitive costs associated with data labeling. CD refers to the scenario where the underlying data distribution of the model shifts over time, potentially deteriorating model performance. Addressing these challenges, this paper proposes an innovative semi-supervised CD detection method, specifically designed to tackle both CD and the high costs of data labeling in regression tasks. Initially, considering the high expense of acquiring labeled data in industrial application scenarios, a semi-supervised learning strategy based on self-training is utilized. In this strategy, prediction intervals generated by Conformal Prediction (CP) are used to select high-reliability pseudo-labels. Furthermore, to effectively address CD in real-world industrial settings, the Conformal Martingale (CM) is employed for real-time detection. This framework detects changes by identifying increases in martingale values when CD occurs. Upon detection, the model is promptly retrained using the most recent data following the drift. Finally, the proposed method is validated through experiments conducted on three datasets: the UCI dataset, the alumina evaporation process dataset, and the blast furnace ironmaking dataset. Experimental results demonstrate that the proposed semi-supervised method significantly enhances the performance of the original training model. The detection method accurately identifies CD and notably reduces test errors through model retraining, thereby improving the effectiveness of the model in real-world industrial applications.
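The detection step described above can be illustrated with a minimal sketch. The abstract does not state which nonconformity score or betting function the authors use, so this example assumes absolute prediction residuals as scores and a standard power martingale; the function names, `epsilon=0.92`, and `threshold=100.0` are illustrative assumptions, not the paper's settings.

```python
import numpy as np


def conformal_pvalue(past_scores, new_score, rng):
    """Smoothed conformal p-value of new_score against earlier scores."""
    past = np.asarray(past_scores, dtype=float)
    greater = np.sum(past > new_score)
    equal = np.sum(past == new_score) + 1  # the new score ties with itself
    return (greater + rng.uniform() * equal) / (past.size + 1)


def detect_drift(scores, epsilon=0.92, threshold=100.0, seed=0):
    """Return the first index where the power martingale reaches threshold.

    scores: nonconformity scores in arrival order (assumed here to be
    absolute residuals |y - y_hat| of a fixed model).
    Returns None if the martingale never reaches the threshold.
    """
    rng = np.random.default_rng(seed)
    log_m = 0.0  # log of the martingale value, kept in log space for stability
    log_threshold = np.log(threshold)
    for n, score in enumerate(scores):
        p = conformal_pvalue(scores[:n], score, rng)
        p = max(p, 1e-12)  # guard against log(0)
        # Power martingale update: M_n = M_{n-1} * epsilon * p^(epsilon - 1)
        log_m += np.log(epsilon) + (epsilon - 1.0) * np.log(p)
        if log_m >= log_threshold:
            return n
    return None


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    stable = np.abs(rng.normal(size=200))          # residuals before drift
    drifted = np.abs(rng.normal(size=100)) + 5.0   # residuals after drift
    print(detect_drift(np.concatenate([stable, drifted])))
```

While the data remain exchangeable the p-values are uniform and the martingale stays small; by Ville's inequality the chance of it ever reaching a threshold of 100 is at most 1%. After a drift, unusually large residuals produce small p-values and the martingale grows, and the returned index marks where retraining on post-drift data would be triggered.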
Source journal
Journal of Process Control (Engineering and Technology, Engineering: Chemical)
CiteScore: 7.00
Self-citation rate: 11.90%
Articles published: 159
Review time: 74 days
About the journal: This international journal covers the application of control theory, operations research, computer science and engineering principles to the solution of process control problems. In addition to the traditional chemical processing and manufacturing applications, the scope of process control problems involves a wide range of applications that includes energy processes, nano-technology, systems biology, bio-medical engineering, pharmaceutical processing technology, energy storage and conversion, smart grid, and data analytics among others. Papers on the theory in these areas will also be accepted provided the theoretical contribution is aimed at the application and the development of process control techniques. Topics covered include:
• Control applications
• Process monitoring
• Plant-wide control
• Process control systems
• Control techniques and algorithms
• Process modelling and simulation
• Design methods
Advanced design methods exclude well established and widely studied traditional design techniques such as PID tuning and its many variants. Applications in fields such as control of automotive engines, machinery and robotics are not deemed suitable unless a clear motivation for the relevance to process control is provided.