IF 2.6; Zone 4, Computer Science; Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; Big Data; Pub Date: 2024-04-01; Epub Date: 2023-02-24; DOI: 10.1089/big.2022.0155
Suyel Namasudra, S Dhamodharavadhani, R Rathipriya, Ruben Gonzalez Crespo, Nageswara Rao Moparthi
Big Data, pp. 83-99; Journal Article; https://doi.org/10.1089/big.2022.0155
Citations: 0
Abstract
Enhanced Neural Network-Based Univariate Time-Series Forecasting Model for Big Data.
Big data is a combination of large structured, semistructured, and unstructured data sets collected from various sources, which must be processed before they can be used in analytical applications. Anomalies or inconsistencies in big data are data points that are in some way unusual and do not fit the general pattern; they are considered one of the major problems of big data. The data trust method (DTM) is a technique that identifies anomalous or untrustworthy data and replaces them using interpolation. This article discusses the DTM as a preprocessing step for univariate time series (UTS) forecasting of big data with a neural network (NN) model. In this work, the DTM combines a statistics-based method for detecting untrustworthy data with a statistics-based method for replacing them, and it is used to improve the forecast quality of UTS. An enhanced NN model for big data is proposed that incorporates DTMs into the NN-based UTS forecasting model. The coefficient variance root mean squared error is used as the main indicator for choosing the best UTS data for model development. The results show that the proposed method is effective: it improves the prediction process by identifying and replacing untrustworthy big data.
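The abstract does not say which statistical test or interpolation scheme the DTM uses, so the following is only a minimal sketch of the detect-then-replace idea under stated assumptions. The function names (`detect_untrustworthy`, `replace_by_interpolation`, `cv_rmse`) are hypothetical, the detection rule is a median/MAD modified z-score chosen for illustration, the replacement is plain linear interpolation, and the indicator is read as the common CV(RMSE) definition, RMSE divided by the mean of the observed values. None of these choices is confirmed as the authors' method.

```python
import numpy as np


def detect_untrustworthy(series, threshold=3.5):
    """Flag points whose modified z-score (median/MAD based) exceeds threshold.

    Assumption: the paper's detection rule is not given in the abstract;
    this robust z-score is only an illustrative stand-in.
    """
    x = np.asarray(series, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    if mad == 0:
        # No spread around the median: nothing can be flagged by this rule.
        return np.zeros(x.shape, dtype=bool)
    modified_z = 0.6745 * (x - median) / mad
    return np.abs(modified_z) > threshold


def replace_by_interpolation(series, mask):
    """Replace flagged points by linear interpolation over unflagged neighbors."""
    x = np.asarray(series, dtype=float).copy()
    mask = np.asarray(mask, dtype=bool)
    good = ~mask
    if not good.any():
        # Nothing trustworthy to interpolate from; return the data unchanged.
        return x
    idx = np.arange(x.size)
    x[mask] = np.interp(idx[mask], idx[good], x[good])
    return x


def cv_rmse(observed, predicted):
    """RMSE divided by the mean of the observed values (assumed definition)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return rmse / np.mean(observed)


if __name__ == "__main__":
    raw = [10, 11, 12, 100, 14, 15, 16]  # made-up series with one spike
    mask = detect_untrustworthy(raw)
    cleaned = replace_by_interpolation(raw, mask)
    print(mask)     # only the spike at index 3 is flagged
    print(cleaned)  # the spike is replaced by 13.0, midway between 12 and 14
```

In a pipeline like the one the abstract describes, the cleaned series would then be passed to the NN forecaster, and a score such as `cv_rmse` could be compared across candidate preprocessing variants; how the authors actually apply the indicator is not stated here.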
Big Data: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; COMPUTER SCIENCE, THEORY & METHODS
CiteScore: 9.10
Self-citation rate: 2.20%
Articles published: 60
About the journal:
Big Data is the leading peer-reviewed journal covering the challenges and opportunities in collecting, analyzing, and disseminating vast amounts of data. The Journal addresses questions surrounding this powerful and growing field of data science and facilitates the efforts of researchers, business managers, analysts, developers, data scientists, physicists, statisticians, infrastructure developers, academics, and policymakers to improve operations, profitability, and communications within their businesses and institutions.
Spanning a broad array of disciplines focusing on novel big data technologies, policies, and innovations, the Journal brings together the community to address current challenges and enforce effective efforts to organize, store, disseminate, protect, manipulate, and, most importantly, find the most effective strategies to make this incredible amount of information work to benefit society, industry, academia, and government.
Big Data coverage includes:
Big data industry standards,
New technologies being developed specifically for big data,
Data acquisition, cleaning, distribution, and best practices,
Data protection, privacy, and policy,
Business interests from research to product,
The changing role of business intelligence,
Visualization and design principles of big data infrastructures,
Physical interfaces and robotics,
Social networking advantages for Facebook, Twitter, Amazon, Google, etc.,
Opportunities around big data and how companies can harness it to their advantage.