Since the emergence of the novel COVID-19 pandemic in December 2019, numerous mathematical models have been published to assess the transmission dynamics of the disease, predict its future course, and evaluate the impact of different control measures. The simplest models make the basic assumptions that individuals are perfectly and evenly mixed and have the same social structures. Such assumptions become problematic for large developing countries that aggregate heterogeneous COVID-19 outbreaks in local areas. Thus, this paper proposes a spatial SEIRDV model that includes spatial vaccination coverage, spatial vulnerability, and level of mobility, to take into account the spatial–temporal clustering pattern of COVID-19 cases. The conclusion of this study is that immunity, government interventions, infectiousness and virulence are the main drivers of the spread of COVID-19. These factors should be taken into consideration when scientists, public policy makers and other stakeholders in the health community analyse, create and project future disease prevention scenarios. Such a model has a place for disease outbreaks that may occur in the future, allowing for the inclusion of vaccination rates in a spatial manner.
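The "perfectly and evenly mixed" assumption mentioned above can be illustrated with a generic, non-spatial compartmental sketch. The compartments follow the SEIRDV acronym (susceptible, exposed, infectious, recovered, dead, vaccinated), but the equations, the rate names (`beta`, `sigma`, `gamma`, `mu`, `nu`) and all numeric values are textbook-style assumptions chosen for illustration. They are not the spatial model proposed in the paper.

```python
def seirdv_step(state, beta, sigma, gamma, mu, nu, dt):
    """Advance a well-mixed SEIRDV model by one forward-Euler step.

    state = (S, E, I, R, D, V). All rates are hypothetical, per unit time:
    beta (transmission), sigma (E -> I), gamma (recovery),
    mu (disease-induced death), nu (vaccination of susceptibles).
    """
    s, e, i, r, d, v = state
    n_alive = s + e + i + r + v            # the dead do not mix
    new_exposed = beta * s * i / n_alive   # homogeneous-mixing incidence
    ds = -new_exposed - nu * s
    de = new_exposed - sigma * e
    di = sigma * e - (gamma + mu) * i
    dr = gamma * i
    dd = mu * i
    dv = nu * s
    derivs = (ds, de, di, dr, dd, dv)
    return tuple(x + dt * dx for x, dx in zip(state, derivs))


def simulate(state, steps, **params):
    """Return the list of states visited over `steps` Euler steps."""
    trajectory = [state]
    for _ in range(steps):
        state = seirdv_step(state, **params)
        trajectory.append(state)
    return trajectory


# Illustrative run: 10,000 people, 10 initially infectious.
trajectory = simulate(
    (9990.0, 0.0, 10.0, 0.0, 0.0, 0.0),
    steps=1000,
    beta=0.4, sigma=0.2, gamma=0.1, mu=0.005, nu=0.002, dt=0.1,
)
```

Because every outflow from one compartment is an inflow to another, the six compartments always sum to the initial population. A spatial variant would typically keep one such state per region and couple the regions, for example through mobility.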
Title: A spatial model with vaccinations for COVID-19 in South Africa. Authors: Claudia Dresselhaus, Inger Fabris-Rotelli, Raeesa Manjoo-Docrat, Warren Brettenny, Jenny Holloway, Nada Abdelatif, Renate Thiede, Pravesh Debba, Nontembeko Dudeni-Tlhone. Journal: Spatial Statistics. Pub Date : 2023-11-09 DOI: 10.1016/j.spasta.2023.100792
Pub Date : 2023-11-07 DOI: 10.1016/j.spasta.2023.100791
Yunquan Song, Yaqi Liu, Xiaodi Zhang, Yuanfeng Wang
Spatial data are widely used and highly valued, and their analysis has achieved remarkable results. Spatial data exhibit spatial effects and do not satisfy the assumption of independence; thus, traditional econometric analysis methods cannot be directly used in spatial models, and the spatial autocorrelation and spatial heterogeneity of spatial data make the research more complicated and difficult. Generalized moment estimation (GMM) is a powerful tool for statistical modeling and inference of spatial data. Considering the case where there is a set of correctly specified moment conditions and another set of possibly misspecified moment conditions for spatial data, this paper proposes a GMM shrinkage method to estimate the unknown parameters of the spatial autoregressive model with spatial autoregressive disturbances. The proposed GMM estimator is shown to enjoy oracle properties; i.e., it selects the valid moment conditions consistently from the candidate set and includes them in the estimation automatically. The resulting estimator is asymptotically as efficient as the GMM estimator based on all valid moment conditions. Monte Carlo studies show that the method works well in terms of valid moment selection and the finite sample properties of its estimators.
Title: General spatial model meets adaptive shrinkage generalized moment estimation: Simultaneous model and moment selection. Journal: Spatial Statistics.
Pub Date : 2023-11-04 DOI: 10.1016/j.spasta.2023.100790
Alan Ricardo da Silva, Marcos Douglas Rodrigues de Sousa
Poisson and Negative Binomial Regression Models are often used to describe the relationship between a count dependent variable and a set of independent variables. However, these models fail to fit data with an excess of zeros, for which Zero-Inflated Poisson (ZIP) and Zero-Inflated Negative Binomial (ZINB) models are the most appropriate. To incorporate the spatial dimension into count data models, Geographically Weighted Poisson Regression (GWPR), Geographically Weighted Negative Binomial Regression (GWNBR) and Geographically Weighted Zero-Inflated Poisson Regression (GWZIPR) have been developed, but a zero-inflated version of the negative binomial model, which would handle both overdispersion and an excess of zeros, has not. Such data arose at the beginning of the COVID-19 pandemic, when some places were experiencing an outbreak of cases while others had no cases yet. Therefore, we propose a Geographically Weighted Zero-Inflated Negative Binomial Regression (GWZINBR) model, which can be considered a general case for count data, since locally it can become a GWZIPR, GWNBR or GWPR model. We applied this model to simulated data and to the cases of COVID-19 in South Korea at the beginning of the pandemic in 2020, and the results showed a better understanding of the phenomenon compared to the GWNBR model.
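The sense in which a zero-inflated negative binomial is a "general case" can be sketched with its probability mass function. The version below is a minimal, non-spatial illustration that assumes the common mean/dispersion parameterization (variance `mu + alpha * mu**2`) and a constant zero-inflation probability `pi`. The function names are hypothetical, and the geographically weighted estimation of the paper is not reproduced here.

```python
import math


def nb_pmf(k, mu, alpha):
    """Negative binomial pmf with mean mu > 0 and dispersion alpha > 0.

    Uses size r = 1/alpha, so that Var(Y) = mu + alpha * mu**2.
    """
    r = 1.0 / alpha
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        - r * math.log1p(mu / r)          # r * log(r / (r + mu))
        + k * math.log(mu / (r + mu))     # k * log(1 - p)
    )
    return math.exp(log_pmf)


def zinb_pmf(k, mu, alpha, pi):
    """Zero-inflated NB: a structural zero with probability pi, else NB."""
    structural_zero = pi if k == 0 else 0.0
    return structural_zero + (1.0 - pi) * nb_pmf(k, mu, alpha)


# Nested special cases:
#   pi = 0                 -> plain negative binomial
#   alpha -> 0             -> zero-inflated Poisson
#   pi = 0 and alpha -> 0  -> plain Poisson
p_zero = zinb_pmf(0, mu=2.0, alpha=0.5, pi=0.3)
```

Setting `pi` to zero or letting `alpha` shrink toward zero recovers the simpler count models, which mirrors how the GWZINBR model can locally reduce to GWZIPR, GWNBR or GWPR.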
Title: Geographically Weighted Zero-Inflated Negative Binomial Regression: A general case for count data. Journal: Spatial Statistics.
Pub Date : 2023-10-31 DOI: 10.1016/j.spasta.2023.100788
Patrick E. Brown
Title: Review of Sujit Sahu’s “Bayesian modeling of spatio-temporal data with R”. Journal: Spatial Statistics.
Pub Date : 2023-10-12 DOI: 10.1016/j.spasta.2023.100786
Yu Shu, Jinwen Liang, Yaohua Rong, Zhenzhen Fu, Yi Yang
Ignoring potential spatial autocorrelation in georeferenced data may cause biased estimators. Furthermore, existing studies assume a structure for the spatial lag model that is insufficiently flexible for some practical applications, which makes it difficult to portray the complex relationship between responses and covariates. Thus, we propose a novel garrotized kernel machine estimation method for the nonparametric spatial lag model and develop an eigenvector spatial filtering algorithm with sparse regression to filter spatial autocorrelation out of the residuals. The “one-group-at-a-time” cyclical coordinate descent algorithm is introduced to compute a solution path over the tuning parameters. Our method can better describe the potential nonlinear relationship between responses and covariates, making it possible to model high-order interaction effects among covariates. Numerical results and the analysis of commodity residential house prices in large and medium-sized Chinese cities indicate that the proposed method achieves better prediction performance than competing ones. The results of the real data analysis can provide guidance for the government to take targeted measures to curb house prices in different areas.
Title: A more accurate estimation with kernel machine for nonparametric spatial lag models. Journal: Spatial Statistics.
Pub Date : 2023-10-12 DOI: 10.1016/j.spasta.2023.100787
Kesen Wang, Sameh Abdulah, Ying Sun, Marc G. Genton
The Matérn family of covariance functions is currently the most widely used model in spatial statistics, geostatistics, and machine learning to specify the correlation between two geographical locations based on spatial distance. Compared to other existing covariance functions, the Matérn family has more flexibility in data fitting because it allows the control of the field smoothness through a dedicated parameter. Moreover, it generalizes other popular covariance functions. However, fitting the smoothness parameter is computationally challenging since it complicates the optimization process. As a result, some practitioners set the smoothness parameter at an arbitrary value to reduce the optimization convergence time. In the literature, studies have used various parameterizations of the Matérn covariance function, assuming they are equivalent. This work aims at studying the effectiveness of different parameterizations under various settings. We demonstrate the feasibility of inferring all parameters simultaneously and quantifying their uncertainties on large-scale data using the ExaGeoStat parallel software. We also highlight the importance of the smoothness parameter by analyzing the Fisher information of the statistical parameters. We show that the various parameterizations have different properties and differ from several perspectives. In particular, we study the three most popular parameterizations in terms of parameter estimation accuracy, modeling accuracy and efficiency, prediction efficiency, uncertainty quantification, and asymptotic properties. We further demonstrate their differing performances under nugget effects and approximated covariance. Lastly, we give recommendations for parameterization selection based on our experimental results.
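The role of the smoothness parameter, and the way the Matérn family contains other covariance functions, can be seen from its closed forms at half-integer smoothness. The sketch below assumes one common parameterization, with variance `sigma2`, range `rho` and smoothness `nu`, in which the distance is scaled by `sqrt(2 * nu) / rho`. It is not necessarily one of the three parameterizations compared in the paper, and it covers only `nu` in {0.5, 1.5, 2.5} to avoid needing a Bessel function.

```python
import math


def matern(h, sigma2, rho, nu):
    """Matérn covariance at distance h >= 0 for nu in {0.5, 1.5, 2.5}.

    Assumed parameterization: distance scaled as sqrt(2 * nu) * h / rho.
    nu = 0.5 is the exponential covariance; larger nu gives smoother fields.
    """
    if h < 0 or rho <= 0 or sigma2 < 0:
        raise ValueError("need h >= 0, rho > 0 and sigma2 >= 0")
    if nu == 0.5:
        return sigma2 * math.exp(-h / rho)
    if nu == 1.5:
        a = math.sqrt(3.0) * h / rho
        return sigma2 * (1.0 + a) * math.exp(-a)
    if nu == 2.5:
        a = math.sqrt(5.0) * h / rho
        return sigma2 * (1.0 + a + a * a / 3.0) * math.exp(-a)
    raise ValueError("closed form implemented only for nu in {0.5, 1.5, 2.5}")


# Covariance at the same distance for increasing smoothness.
values = [matern(0.5, sigma2=1.0, rho=1.0, nu=nu) for nu in (0.5, 1.5, 2.5)]
```

General `nu` requires the modified Bessel function of the second kind, which is one reason estimating the smoothness parameter is more expensive than fixing it in advance.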
Title: Which parameterization of the Matérn covariance function? Journal: Spatial Statistics.
Pub Date : 2023-10-06 DOI: 10.1016/j.spasta.2023.100785
Ezra Gayawan, Osafu Augustine Egbon
Studies have shown that stunting and wasting indicators are strongly correlated among children, with the potential of concurrently affecting their physical and cognitive development. However, the identification of subpopulations of children with varying risks of stunting and wasting could be valuable for targeted intervention. This work proposes a bivariate spatio-temporal mixture model within a Bayesian framework to describe the spatial behavior of subpopulations of children within the wider population of children under five years of age in Nigeria. The model assumes that each subpopulation follows a Gaussian distribution, and therefore, the overall population is modeled by combining Gaussian sub-spatial models probabilistically. Inferences were based on a Markov chain Monte Carlo algorithm that draws samples from the joint posterior distribution. The model was applied to data from four waves of the Nigerian Demographic and Health Survey. We identified a significant negative correlation between stunting and wasting among subpopulations, together with a negative spatial correlation between the spatial patterns of both conditions. The findings demonstrate varying risk factors between the subpopulations, with evidence of spatio-temporal disparity in the likelihood of stunting and wasting. The findings underscore the need for a comprehensive national intervention program with attention given to high-burden states in a manner that involves communities and subpopulations. The maps could serve as a valuable tool for intervention planning.
Title: Spatio-temporal mapping of stunting and wasting in Nigerian children: A bivariate mixture modeling. Journal: Spatial Statistics.
Pub Date : 2023-10-01 DOI: 10.1016/j.spasta.2023.100775
Thomas Suesse, Alexander Brenning, Veronika Grupp
Linear Discriminant Analysis (LDA) is a popular and simple classification tool that often outperforms more sophisticated modern machine learning techniques in remote sensing. We introduce a novel LDA method that, to improve classification performance, uses the spatial autocorrelation not only among all pixels of an object to be classified but also with spatially close objects of the training set. To simplify spatial modelling and model fitting, the methodology is applied to the transformed feature vectors. We term this method conditional spatial LDA. Much like universal Kriging in geostatistical interpolation, the combined use of feature data and conditioning on labelled training data in conditional spatial LDA was best able to exploit the available geospatial data. The method is illustrated on a crop classification case study from the Aconcagua agricultural region in central Chile.
Title: Spatial linear discriminant analysis approaches for remote-sensing classification. Journal: Spatial Statistics.
Pub Date : 2023-10-01 DOI: 10.1016/j.spasta.2023.100773
Pratik Nag, Ying Sun, Brian J. Reich
Gaussian processes (GP) and Kriging are widely used in traditional spatio-temporal modelling and prediction. These techniques typically presuppose that the data are observed from a stationary GP with a parametric covariance structure. However, processes in real-world applications often exhibit non-Gaussianity and nonstationarity. Moreover, likelihood-based inference for GPs is computationally expensive and thus prohibitive for large datasets. In this paper, we propose a deep neural network (DNN) based two-stage model for spatio-temporal interpolation and forecasting. Interpolation is performed in the first stage, which utilizes a dependent DNN with the embedding layer constructed with spatio-temporal basis functions. For the second stage, we use Long Short-Term Memory (LSTM) and convolutional LSTM to forecast future observations at a given location. We adopt the quantile-based loss function in the DNN to provide probabilistic forecasting. Compared to Kriging, the proposed method does not require specifying covariance functions or making stationarity assumptions and is computationally efficient. Therefore, it is suitable for large-scale prediction of complex spatio-temporal processes. We apply our method to monthly PM2.5 data at more than 200,000 space–time locations from January 1999 to December 2022 for fast imputation of missing values and forecasts with uncertainties.
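A quantile-based loss of the kind mentioned above is commonly the pinball (check) loss, whose minimizer over a sample is the corresponding empirical quantile. The sketch below assumes that form and uses made-up observations. The function names are hypothetical, and the network architecture and training procedure of the paper are not reproduced.

```python
def pinball_loss(y_true, y_pred, tau):
    """Pinball (quantile) loss for a single prediction, with 0 < tau < 1.

    Under-prediction is weighted by tau, over-prediction by 1 - tau.
    """
    diff = y_true - y_pred
    return max(tau * diff, (tau - 1.0) * diff)


def mean_pinball_loss(ys, q, tau):
    """Average pinball loss of the constant prediction q over a sample."""
    return sum(pinball_loss(y, q, tau) for y in ys) / len(ys)


def best_constant(ys, tau):
    """Sample value that minimizes the mean pinball loss at level tau."""
    return min(ys, key=lambda q: mean_pinball_loss(ys, q, tau))


# Made-up observations with one large outlier.
observations = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(observations, tau=0.5)   # picks the median
upper_fit = best_constant(observations, tau=0.9)    # picks an upper quantile
```

Training one network output per level, for example `tau` = 0.05, 0.5 and 0.95, gives a point forecast together with a prediction interval without assuming a Gaussian error distribution.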
Title: Spatio-temporal DeepKriging for interpolation and probabilistic forecasting. Journal: Spatial Statistics.