Missing Data Imputation for Multivariate Time series in Industrial IoT: A Federated Learning Approach

2022 IEEE 20th International Conference on Industrial Informatics (INDIN). Pub Date: 2022-07-25. DOI: 10.1109/INDIN51773.2022.9976093

A. Gkillas, A. Lalos


Citations: 1

Abstract

In multidimensional time series generated by the sensor recordings of multiple dispersed IoT edge devices, missing measurements are commonplace due to sensing or communication failures, and are considered a thorny and perplexing problem in a wide range of Industry 4.0 applications. Existing studies on time series imputation focus on developing centralized deep learning approaches, which require massive amounts of data to be uploaded to a central server with adequate computational and power resources for model training, since these approaches are unsuitable for edge and IoT devices characterized by limited computational resources. Unlike the current literature, this study examines the time series imputation problem from a federated learning perspective, which is able to surmount the above difficulties. In particular, a novel federated learning approach is proposed, assuming different IoT devices with varying sensing and computational capabilities, that trades off accuracy against computational, communication, and sensing complexity and minimizes the operations that need to be performed during the training and inference phases. Furthermore, considering that the main computations are performed on the edge, where the IoT edge devices have limited computational capabilities and power resources, a lightweight yet effective autoencoder-based model is employed to address the examined problem, properly modified to capture the temporal dependencies of the time series data. Extensive evaluation studies with two open datasets have shown that both approaches minimize the data exchanges that need to be made while outperforming centralized approaches in the presence of limited training data.
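The abstract does not give the model or the aggregation rule in detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes plain federated averaging (FedAvg) over three simulated devices, a rank-1 linear autoencoder trained with a loss computed on observed entries only, and synthetic sensor data. All names (`local_train`, `fed_avg`, `impute`, `make_client`) and all hyperparameters are hypothetical, and the toy model ignores the temporal dependencies that the paper's model is said to capture.

```python
import math
import random

D = 3  # number of sensor channels per device (illustrative assumption)

def to_masked(row):
    """Split a row with None gaps into zero-filled values and a 0/1 mask."""
    mask = [0 if r is None else 1 for r in row]
    return [0.0 if r is None else r for r in row], mask

def forward(params, x, mask):
    """Rank-1 linear autoencoder: encode observed inputs to a scalar, decode."""
    w, v, b = params[:D], params[D:2 * D], params[2 * D:]
    h = sum(w[j] * x[j] * mask[j] for j in range(D))
    return h, [v[i] * h + b[i] for i in range(D)]

def masked_mse(pred, target, mask):
    """Mean squared error over observed entries only."""
    n = sum(mask)
    if n == 0:
        return 0.0
    return sum(m * (p - t) ** 2 for p, t, m in zip(pred, target, mask)) / n

def local_train(params, rows, lr=0.05, epochs=5):
    """Per-sample gradient descent on one device; raw data never leaves it."""
    p = list(params)
    for _ in range(epochs):
        for row in rows:
            x, mask = to_masked(row)
            n = sum(mask)
            if n == 0:
                continue  # nothing observed in this row
            h, xhat = forward(p, x, mask)
            # Gradient of the masked MSE with respect to each output.
            err = [2.0 * mask[i] * (xhat[i] - x[i]) / n for i in range(D)]
            dh = sum(err[i] * p[D + i] for i in range(D))
            for i in range(D):
                p[i] -= lr * dh * x[i] * mask[i]   # encoder weights
            for i in range(D):
                p[D + i] -= lr * err[i] * h        # decoder weights
                p[2 * D + i] -= lr * err[i]        # decoder biases
    return p

def fed_avg(client_params, client_sizes):
    """Server step: average client parameters, weighted by local data size."""
    total = sum(client_sizes)
    return [sum(p[k] * n for p, n in zip(client_params, client_sizes)) / total
            for k in range(len(client_params[0]))]

def mean_loss(params, clients):
    """Average masked reconstruction error over all rows with observations."""
    losses = []
    for rows in clients:
        for row in rows:
            x, mask = to_masked(row)
            if sum(mask):
                losses.append(masked_mse(forward(params, x, mask)[1], x, mask))
    return sum(losses) / len(losses)

def impute(params, row):
    """Keep observed values; fill gaps with the model's reconstruction."""
    x, mask = to_masked(row)
    _, xhat = forward(params, x, mask)
    return [x[i] if mask[i] else xhat[i] for i in range(D)]

def make_client(seed, length=40, drop=0.2):
    """Synthetic correlated sensor readings with randomly dropped entries."""
    rng = random.Random(seed)
    rows = []
    for t in range(length):
        s = 0.5 + 0.4 * math.sin(t / 5.0 + seed)
        full = [s, 0.5 * s + 0.1, 1.0 - s]
        rows.append([None if rng.random() < drop else val for val in full])
    return rows

clients = [make_client(seed) for seed in (1, 2, 3)]
global_params = [0.1] * (2 * D) + [0.0] * D
initial_loss = mean_loss(global_params, clients)
for _ in range(10):  # communication rounds: only parameters are exchanged
    updates = [local_train(global_params, rows) for rows in clients]
    global_params = fed_avg(updates, [len(rows) for rows in clients])
final_loss = mean_loss(global_params, clients)
print(initial_loss, final_loss, impute(global_params, [0.5, None, 0.5]))
```

In this sketch each round sends only the nine model parameters per device to the server, which illustrates why a federated setup can reduce data exchange compared with uploading raw sensor recordings to a central server.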



