Prediction for Big Data Through Kriging: Small Sequential and One-Shot Designs

American Journal of Mathematical and Management Sciences (Q3, Business, Management and Accounting), vol. 39, no. 1, pp. 199-213. Pub Date: 2020-01-30. DOI: 10.1080/01966324.2020.1716281
J. Kleijnen, Wim C. M. van Beers
Citations: 13

Abstract

Kriging, or Gaussian process (GP) modeling, is an interpolation method that assumes the outputs (responses) are more strongly correlated the closer their inputs (explanatory or independent variables) are. Such a GP has unknown (hyper)parameters that are usually estimated by maximum likelihood. With big data, however, computing these estimates, the corresponding Kriging predictor, and the predictor variance becomes problematic. To solve this problem, some authors select a relatively small subset from the big set of previously observed “old” data. These selection methods are sequential, and they depend on the variance of the Kriging predictor; this variance requires a specific Kriging model and the estimation of its parameters. The resulting designs turn out to be “local”; i.e., most selected old input combinations are concentrated around the new combination to be predicted. We develop a simpler one-shot (fixed-sample, non-sequential) design; i.e., from the big data set we select a small subset consisting of the nearest neighbors of the new combination. To compare our designs and the sequential designs empirically, we use the squared prediction errors in several numerical experiments. These experiments show that our design may yield reasonable performance.
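The one-shot idea described in the abstract can be illustrated with a minimal sketch: pick the k nearest old points to the new input combination, fit Kriging on that small subset only, and score the result by its squared prediction error. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`nearest_neighbors`, `kriging_predict`, `demo`), the Gaussian correlation function, the fixed value of `theta` (which the abstract says is normally estimated by maximum likelihood), the small `nugget` added for numerical stability, and the invented test function inside `demo`.

```python
import math
import random


def nearest_neighbors(X, y, x_new, k):
    """One-shot design: keep the k old points closest to x_new (Euclidean distance)."""
    order = sorted(range(len(X)), key=lambda i: math.dist(X[i], x_new))[:k]
    return [X[i] for i in order], [y[i] for i in order]


def gauss_corr(a, b, theta):
    """Gaussian correlation (assumed here): closer inputs give correlation nearer 1."""
    return math.exp(-theta * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (M[r][n] - sum(M[r][j] * z[j] for j in range(r + 1, n))) / M[r][r]
    return z


def kriging_predict(X, y, x_new, theta=10.0, nugget=1e-6):
    """Ordinary Kriging predictor with a fixed theta (an assumption; not estimated)."""
    n = len(X)
    R = [[gauss_corr(X[i], X[j], theta) + (nugget if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    r = [gauss_corr(xi, x_new, theta) for xi in X]
    # Generalized least-squares estimate of the constant mean: 1'R^-1 y / 1'R^-1 1.
    mu = sum(solve(R, y)) / sum(solve(R, [1.0] * n))
    w = solve(R, [yi - mu for yi in y])
    return mu + sum(ri * wi for ri, wi in zip(r, w))


def demo(seed=0, n_old=2000, k=15):
    """Squared prediction error of the one-shot design on an invented test function."""
    rng = random.Random(seed)
    f = lambda x: math.sin(3 * x[0]) + math.cos(2 * x[1])  # hypothetical "true" response
    X = [[rng.random(), rng.random()] for _ in range(n_old)]  # the big "old" data set
    y = [f(x) for x in X]
    x_new = [0.5, 0.5]  # new input combination to be predicted
    Xs, ys = nearest_neighbors(X, y, x_new, k)  # small local subset
    return (kriging_predict(Xs, ys, x_new) - f(x_new)) ** 2


if __name__ == "__main__":
    print("squared prediction error:", demo())
```

Unlike the sequential designs mentioned in the abstract, the selection step here never consults the predictor variance, so no Kriging model has to be fitted before the subset is chosen; the linear systems solved are only k by k instead of n_old by n_old.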