Mauricio Morais Almeida, João Dallyson Sousa Almeida, Darlan Bruno Pontes Quintanilha, Geraldo Braz Júnior, Aristófanes Correa Silva
Applied Soft Computing, Volume 172, Article 112845. Published 2025-03-01 (Epub 2025-02-10). DOI: 10.1016/j.asoc.2025.112845
https://www.sciencedirect.com/science/article/pii/S1568494625001565
A meta-learning based neural network and LSTM for univariate time series missing data imputation
Time series are regularly collected data that describe the average evolution of an event over time, making them increasingly relevant in areas such as business, natural sciences and medicine. A major challenge related to time series is data loss, and several approaches have been developed to recover missing values in univariate time series (UTS). This work aims to improve the imputation of missing data in univariate and heterogeneous time series. To that end, we built a diverse database covering different time series domains and selected a set of data imputation techniques. The results show that imputation in time series is challenging, especially due to the variability of the series, the position of the missing data and the number of samples passed to each technique. The HybridLSTM network, developed in this study, proved effective in recommending the most suitable imputation techniques for each series, resulting in a lower average error than using a single technique or recent techniques such as Pix2Pix and Moment. In addition, adopting a hybrid loss function, which considers multi-class and multi-label tasks, contributed to optimal or near-optimal performance, even in cases where the ideal was not achieved. These advances were possible thanks to the efficient but simple construction of metadata and the innovative approach of locally combining several imputation techniques within the same series. We observed that meta-learning has great potential to be applied in real contexts where the ideal technique is not known in advance and the data has not been pre-treated in terms of data values. Moreover, because our experiments were very close to this context and the model performed very close to the ideal, the results support the applicability of the adaptive meta-learning approach to optimizing the imputation of missing data in real contexts.
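To make the recommendation idea concrete, the following is a minimal sketch of how per-series training targets for such a recommender could be built. It is not the authors' implementation: the three candidate imputers, the `meta_labels` helper, the mean-absolute-error criterion and the 10% tolerance are all illustrative assumptions. Values of a fully observed series are hidden, each candidate fills them in, and the errors yield both a single best technique (a multi-class target) and a set of near-best techniques (a multi-label target).

```python
import numpy as np

def impute_mean(x):
    """Fill NaNs with the mean of the observed values."""
    out = x.copy()
    out[np.isnan(out)] = np.nanmean(x)
    return out

def impute_locf(x):
    """Last observation carried forward; leading NaNs take the first observed value."""
    observed = np.flatnonzero(~np.isnan(x))
    # For every time step, index of the most recent observed position (-1 if none yet).
    idx = np.where(~np.isnan(x), np.arange(len(x)), -1)
    idx = np.maximum.accumulate(idx)
    idx[idx < 0] = observed[0]
    return x[idx]

def impute_linear(x):
    """Linear interpolation between the nearest observed neighbours."""
    out = x.copy()
    missing = np.isnan(out)
    t = np.arange(len(out))
    out[missing] = np.interp(t[missing], t[~missing], out[~missing])
    return out

# Hypothetical candidate pool; the paper's actual set of techniques is not listed in the abstract.
CANDIDATES = {"mean": impute_mean, "locf": impute_locf, "linear": impute_linear}

def meta_labels(series, mask, tolerance=0.10):
    """Hide series[mask], impute with every candidate, and score by mean absolute error.

    Returns (errors, best, acceptable):
      errors      dict of technique name -> MAE on the hidden positions
      best        name of the lowest-error technique (multi-class target)
      acceptable  names within `tolerance` of the best error (multi-label target)
    """
    series = np.asarray(series, dtype=float)
    corrupted = series.copy()
    corrupted[mask] = np.nan
    errors = {
        name: float(np.mean(np.abs(fn(corrupted)[mask] - series[mask])))
        for name, fn in CANDIDATES.items()
    }
    best = min(errors, key=errors.get)
    acceptable = {n for n, e in errors.items() if e <= errors[best] * (1 + tolerance) + 1e-12}
    return errors, best, acceptable

if __name__ == "__main__":
    series = np.arange(10.0)            # a perfectly linear toy series
    mask = np.zeros(10, dtype=bool)
    mask[[3, 4, 7]] = True              # positions to hide
    print(meta_labels(series, mask))
```

A recommender trained on pairs of series metadata and these targets could then be evaluated against the per-series best error. The sketch scores whole series only and does not attempt the local combination of several techniques within one series that the abstract mentions.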
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in application and convergence of the areas of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.