A meta-learning based neural network and LSTM for univariate time series missing data imputation

IF 6.6 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Applied Soft Computing | Pub Date: 2025-03-01 | Epub Date: 2025-02-10 | DOI: 10.1016/j.asoc.2025.112845

Mauricio Morais Almeida, João Dallyson Sousa Almeida, Darlan Bruno Pontes Quintanilha, Geraldo Braz Júnior, Aristófanes Correa Silva

Applied Soft Computing, volume 172, Article 112845. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S1568494625001565

Citations: 0

Abstract




Time series are regularly collected data that describe the average evolution of an event over time, making them increasingly relevant in areas such as business, natural sciences and medicine. A major challenge related to time series is data loss, and several approaches have been developed to recover missing values in univariate time series (UTS). This work aims to improve the imputation of missing data in univariate and heterogeneous time series. To this end, we built a diverse database covering different time series domains and selected a set of data imputation techniques. The results show that imputation in time series is challenging, especially due to the variability of the series, the position of the missing data and the number of samples passed to each technique. The HybridLSTM network, developed in this study, proved effective in recommending the most suitable imputation techniques for each series, resulting in a lower average error than using a single technique or recent techniques such as Pix2Pix and Moment. In addition, adopting a hybrid loss function, which considers multi-class and multi-label tasks, contributed to optimal or near-optimal performance, even in cases where the ideal was not achieved. These advances were possible thanks to the efficient but simple construction of metadata and the innovative approach of locally combining several imputation techniques within the same series. We observed that meta-learning has great potential to be applied in real contexts where the ideal technique is not previously known and the data has not been pre-treated in terms of data values. Because our experiments were very close to this context and the model performed very close to the ideal, the results validate the applicability of the adaptive meta-learning approach to optimize the imputation of missing data in real contexts.
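The recommendation idea in the abstract (choosing, per series, the imputation technique with the lowest error) can be illustrated with a minimal sketch. It hides a known segment of a series, fills it with several candidate techniques, and scores each one; the winner is the kind of label a meta-learner could be trained to predict. Everything here is an assumption for illustration: the three candidate imputers, the function names, the RMSE metric and the toy series are not the authors' technique set, metadata, loss function or HybridLSTM model.

```python
import numpy as np

def impute_mean(x):
    """Fill NaNs with the mean of the observed values."""
    out = x.copy()
    out[np.isnan(out)] = np.nanmean(x)
    return out

def impute_locf(x):
    """Last observation carried forward; leading NaNs take the first observed value."""
    observed = np.flatnonzero(~np.isnan(x))
    # For each position, index (into `observed`) of the latest observed point at or before it.
    idx = np.searchsorted(observed, np.arange(len(x)), side="right") - 1
    idx = np.clip(idx, 0, len(observed) - 1)
    return x[observed[idx]]

def impute_linear(x):
    """Linear interpolation between neighbouring observed values."""
    out = x.copy()
    missing = np.isnan(out)
    positions = np.arange(len(out))
    out[missing] = np.interp(positions[missing], positions[~missing], out[~missing])
    return out

# Hypothetical candidate pool; the paper's actual pool is not listed in the abstract.
CANDIDATES = {"mean": impute_mean, "locf": impute_locf, "linear": impute_linear}

def score_techniques(series, gap):
    """Hide series[gap], impute with each candidate, return RMSE on the hidden part."""
    truth = series.astype(float)
    masked = truth.copy()
    masked[gap] = np.nan
    errors = {}
    for name, impute in CANDIDATES.items():
        filled = impute(masked)
        errors[name] = float(np.sqrt(np.mean((filled[gap] - truth[gap]) ** 2)))
    return errors

# Toy series: a straight line with a simulated gap at positions 4..6.
series = np.arange(10, dtype=float)
gap = slice(4, 7)
errors = score_techniques(series, gap)
best = min(errors, key=errors.get)  # the label a recommender would try to predict
print(best, errors)
```

On this straight-line toy series linear interpolation recovers the hidden values and is selected; on series with other shapes or gap positions a different candidate can win, which is the variability that motivates learning the recommendation rather than fixing one technique.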


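Source journal

Applied Soft Computing (Engineering & Technology; Computer Science: Interdisciplinary Applications)

CiteScore: 15.80
Self-citation rate: 6.90%
Articles published: 874
Review time: 10.9 months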

Journal description: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real life problems. The focus is to publish the highest quality research in application and convergence of the areas of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.