Self-Starting Monitoring and Dynamic Sampling of High-Dimensional Data Streams

IF 6.4 | Zone 2, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | IEEE Transactions on Automation Science and Engineering | Pub Date: 2024-10-15 | DOI: 10.1109/TASE.2024.3474296
Jiahui Zhang;Ziqian Zheng;Jun Li;Kaibo Liu
IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 7897-7911. https://ieeexplore.ieee.org/document/10717438/
Citations: 0

Abstract

In today's manufacturing industries, advances in sensor technology and the Internet of Things have made real-time process monitoring of high-dimensional data increasingly vital. However, resource constraints, such as limited power, budget, and transmission capacity, often prevent access to full data streams in real time. Practitioners therefore need to monitor the process effectively from only partially observed data, dynamically deciding the sampling layout in real time. Another common challenge in practice is the lack of historical reference data, which can result from process or system upgrades or equipment replacements. To address these challenges, this paper proposes MASS (Monitoring with Adaptive Sampling under Self-starting scheme), a novel self-starting approach tailored to monitoring high-dimensional data streams when only limited resources and historical reference data are available. The monitoring framework is built on a quantile-based nonparametric CUSUM procedure with likelihood ratio-based statistics, and the Thompson Sampling (TS) algorithm is adopted to handle partially observed data in the self-starting scenario. A key feature of the proposed method is its adaptive estimation of the out-of-control distribution and data quantiles, which ensures robust detection of various shifts in data streams with arbitrary and heterogeneous distributions, even when reference data are limited. The superior performance of the proposed method is demonstrated through simulation experiments and a real-world case study.

Note to Practitioners: This paper is motivated by the critical challenges of online process monitoring when only limited resources (e.g., limited power availability, limited number of sensors, and limited transmission capacity) and limited historical in-control reference data are available. For example, consider a newly established production system that requires online monitoring across multiple data streams. In such cases, the reference data needed to construct a reliable control chart are often scarce. In addition, resource constraints frequently make it infeasible to gather information from all streams associated with the production line at each epoch in real time. Unlike previous methods, which require either a sufficient amount of reference data or fully observable data streams, this paper proposes a novel monitoring and dynamic sampling scheme that effectively monitors partially observable data streams with only a small amount of reference data. Implementing the methodology requires: (i) initializing the process by estimating quantiles from a small amount of reference data, (ii) determining which data streams to observe at each time epoch, (iii) adaptively updating the estimates of process parameters during online monitoring, and (iv) constructing a set of local and global statistics that can be used to quickly detect system anomalies in real time. Numerical experiments and a real-world case study suggest that, compared with the benchmark methods, the proposed method leverages available data efficiently to reduce detection delays and improve effectiveness against various shifts.
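
Steps (i) to (iv) above can be illustrated with a deliberately simplified sketch. This is not the MASS algorithm from the paper, whose statistics are not given in this abstract. It assumes each observation is reduced to a binary "outside the 10% to 90% empirical quantile band" indicator, that a Beta posterior on that tail probability drives a Thompson Sampling choice of which streams to observe, and that the global statistic is the sum of the largest local CUSUMs. The names `Stream`, `monitor`, `P0`, `budget`, and `threshold` are hypothetical, and the threshold is arbitrary rather than calibrated to a target in-control run length.

```python
import bisect
import math
import random

P0 = 0.2  # assumed in-control probability of falling outside the 10%-90% band


class Stream:
    """State kept for one data stream (illustrative, not the paper's statistics)."""

    def __init__(self, reference):
        self.history = sorted(reference)  # (i) small in-control reference sample
        self.a, self.b = 1.0, 4.0         # Beta prior on tail probability, mean P0
        self.cusum = 0.0                  # local CUSUM statistic

    def is_tail(self, x):
        # Empirical quantile band computed from all data seen so far.
        n = len(self.history)
        lo = self.history[int(0.1 * (n - 1))]
        hi = self.history[int(0.9 * (n - 1))]
        return x < lo or x > hi

    def update(self, x):
        tail = self.is_tail(x)
        # (iii) plug-in estimate of the out-of-control tail probability,
        # kept strictly above P0 so the likelihood ratio stays informative.
        p1 = max(self.a / (self.a + self.b), P0 + 0.05)
        llr = math.log(p1 / P0) if tail else math.log((1 - p1) / (1 - P0))
        self.cusum = max(0.0, self.cusum + llr)
        self.a += tail
        self.b += not tail
        bisect.insort(self.history, x)  # self-starting: refine quantiles online


def monitor(streams, observe, budget, threshold, max_steps, rng):
    """Return the first epoch at which the global statistic exceeds threshold."""
    for t in range(1, max_steps + 1):
        # (ii) Thompson Sampling: draw a tail probability per stream and
        # observe the `budget` streams with the largest draws.
        draws = [rng.betavariate(s.a, s.b) for s in streams]
        chosen = sorted(range(len(streams)), key=lambda j: -draws[j])[:budget]
        for j in chosen:
            streams[j].update(observe(j, t))
        # (iv) global statistic: sum of the `budget` largest local CUSUMs.
        top = sorted((s.cusum for s in streams), reverse=True)[:budget]
        if sum(top) > threshold:
            return t
    return None


rng = random.Random(0)
# Ten streams, each with only 20 in-control reference observations.
streams = [Stream([rng.gauss(0, 1) for _ in range(20)]) for _ in range(10)]


def observe(j, t):
    # Hypothetical process: stream 3 shifts its mean by 3 after epoch 50.
    shift = 3.0 if (j == 3 and t > 50) else 0.0
    return rng.gauss(shift, 1)


alarm = monitor(streams, observe, budget=3, threshold=8.0, max_steps=500, rng=rng)
print("alarm epoch:", alarm)
```

Only 3 of the 10 streams are read at each epoch, so the shifted stream is detected only if the sampler keeps returning to it. One known weakness of this simplification is that post-shift observations are also inserted into `history`, which contaminates the quantile estimates. A practical scheme would need to guard against that.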
Source journal
IEEE Transactions on Automation Science and Engineering
Category: Engineering & Technology, Automation & Control Systems
CiteScore: 12.50
Self-citation rate: 14.30%
Articles published: 404
Review time: 3.0 months
Journal description: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.