Pub Date : 2024-04-01 DOI: 10.1016/j.spasta.2024.100823
Philipp Otto
This paper introduces a multivariate spatiotemporal autoregressive conditional heteroscedasticity (ARCH) model based on a vec-representation. The model includes instantaneous spatial autoregressive spill-over effects, as they are usually present in geo-referenced data. Furthermore, spatial and temporal cross-variable effects in the conditional variance are explicitly modelled. We transform the model to a multivariate spatiotemporal autoregressive model using a log-squared transformation and derive a consistent quasi-maximum-likelihood estimator (QMLE). For finite samples and different error distributions, the performance of the QMLE is analysed in a series of Monte-Carlo simulations. In addition, we illustrate the practical usage of the new model with a real-world example. We analyse the monthly real-estate price returns for three different property types in Berlin from 2002 to 2014. We find weak (instantaneous) spatial interactions, while the temporal autoregressive structure in the market risks is of higher importance. Interactions between the different property types only occur in the temporally lagged variables. Thus, we see mainly temporal volatility clusters and weak spatial volatility spillovers.
Title: A multivariate spatial and spatiotemporal ARCH Model
Journal: Spatial Statistics
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2211675324000149/pdfft?md5=eb8563b57f62dc0654997c6b2209f850&pid=1-s2.0-S2211675324000149-main.pdf
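The log-squared transformation mentioned in the abstract can be illustrated in the simplest univariate case. The symbols below ($Y_t(s)$ for the observation at location $s$ and time $t$, $h_t(s)$ for its conditional variance, $\varepsilon_t(s)$ for the error) are assumed for illustration and are not taken from the paper, whose multivariate specification is richer.

```latex
Y_t(s) = \sqrt{h_t(s)}\,\varepsilon_t(s)
\quad\Longrightarrow\quad
\ln Y_t(s)^2 = \ln h_t(s) + \ln \varepsilon_t(s)^2 .
```

Under the further assumption that $\ln h_t(s)$ is linear in spatial and temporal lags of $\ln Y^2$, the transformed series has an autoregressive form with additive error $\ln \varepsilon_t(s)^2$. That error is in general not Gaussian, which is one reason a quasi-likelihood approach is a natural choice.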
Fatality arising from violent events is a critical public health problem in Africa. Although numerous studies on crime and violent events have been conducted, the distribution of fatalities arising from these events has not received adequate attention. This study examines the spatio-temporal pattern of fatality from violent events in Western and Central Africa. A two-component spatio-temporal zero-inflated model on a continuous spatial domain was adopted within a Bayesian framework. The stochastic partial differential equation approach was used to quantify the continuous pattern and make projections in unsampled regions. Fatality data from 1997 to 2021 were obtained from the Armed Conflict Location and Event Data Project (ACLED). The results revealed a spatial and temporal divide in the prevalence of fatality in the study region. Between 1997 and 2010, fatality from violence was most prevalent in Central Africa, whereas in more recent years it was most prevalent in Western Africa. The posterior predictive probabilities of fatality occurrence due to violent events were highest in Nigeria and Cameroon, at above 0.6, and the probability of more than one death per violent event was highest in Angola and Chad, at 0.2. By violent event type, suicide bombs had the highest likelihood of fatality occurrence, whereas events in which violent non-state actors overtook territory had the highest impact on the likelihood of multiple fatality counts. Among the armed actors who participated in violent events, armed religious groups were linked to the highest likelihood of fatality occurrence, whereas military forces were linked to the highest likelihood of multiple fatality counts per event. The results also revealed a higher likelihood of multiple fatalities in the winter temperate season. These findings could be used for planning and policy design geared towards mitigating fatality and for guiding resource distribution to support the affected communities.
Title: Bayesian spatio-temporal statistical modeling of violent-related fatality in western and central Africa
Authors: Osafu Augustine Egbon , Asrat Mekonnen Belachew , Mariella Ananias Bogoni , Bayowa Teniola Babalola , Francisco Louzada
Pub Date : 2024-03-24 DOI: 10.1016/j.spasta.2024.100828
Journal: Spatial Statistics
Pub Date : 2024-03-05 DOI: 10.1016/j.spasta.2024.100818
Carolina Euán , Ying Sun , Brian J. Reich
In this paper, we propose a new regime-based model to describe the spatio-temporal dynamics of precipitation data. Precipitation is one of the most essential factors for multiple human-related activities such as agricultural production. Therefore, a detailed and accurate understanding of the rain for a given region is needed. Motivated by the different formations of precipitation systems (convective, frontal, and orographic), we propose a hierarchical regime-based spatio-temporal model for precipitation data. We use information about the values of neighboring sites to identify such regimes, allowing spatial and temporal dependence to differ among regimes. Using a Bayesian approach with R INLA, we fit our model to precipitation data from the state of Guanajuato (Mexico) as a case study to understand the spatial and temporal dependencies of precipitation in this region. Our findings show the regime-based model’s versatility, and we compare it with the truncated Gaussian model.
Title: Regime-based precipitation modeling: A spatio-temporal approach
Journal: Spatial Statistics
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2211675324000095/pdfft?md5=34516482aa33a4d0c7231ce4614fe6c6&pid=1-s2.0-S2211675324000095-main.pdf
Pub Date : 2024-03-01 DOI: 10.1016/j.spasta.2024.100820
Mohammad Moradi , Jennifer Brown
Interpolation is commonly used in the construction of maps and images when there is limited information for some of the sites. The accuracy of interpolation methods depends, in part, on the location of the sample sites where more complete information has been gathered. An initial survey design in which the sample sites are spaced to give wide-spread coverage is desirable. However, when there is considerable variation in the variable of interest, other design features may be preferable. Here we introduce an adaptive design in which the first stage of site selection gives wide-spread coverage, and in subsequent stages additional sites are selected adjacent to areas of high variability.
Title: Mapping using an adaptive sampling design
Journal: Spatial Statistics
Pub Date : 2024-02-05 DOI: 10.1016/j.spasta.2024.100817
Mevin B. Hooten , Michael R. Schwob , Devin S. Johnson , Jacob S. Ivan
Methods for population estimation and inference have evolved over the past decade to allow for the incorporation of spatial information when using capture–recapture study designs. Traditional approaches to specifying spatial capture–recapture (SCR) models often rely on an individual-based detection function that decays as a detection location is farther from an individual’s activity center. Traditional SCR models are intuitive because they incorporate mechanisms of animal space use based on their assumptions about activity centers. We modify the SCR model to accommodate a wide range of space use patterns, including for those individuals that may exhibit traditional elliptical utilization distributions. Our approach uses underlying Gaussian processes to characterize the space use of individuals. This allows us to account for multimodal and other complex space use patterns that may arise due to movement. We refer to this class of models as geostatistical capture–recapture (GCR) models. We adapt a recursive computing strategy to fit GCR models to data in stages, some of which can be parallelized. This technique facilitates implementation and leverages modern multicore and distributed computing environments. We demonstrate the application of GCR models by analyzing both simulated data and a data set involving capture histories of snowshoe hares in central Colorado, USA.
Title: Geostatistical capture–recapture models
Journal: Spatial Statistics
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2211675324000083/pdfft?md5=09305eb130f1cfc623cdc920435039a4&pid=1-s2.0-S2211675324000083-main.pdf
Pub Date : 2024-02-05 DOI: 10.1016/j.spasta.2024.100816
Qi Zhang , Alexandra M. Schmidt , Yogendra P. Chaubey
Commonly, observations from environmental processes are spatially structured and present skewed distributions. Recently, different models have been proposed to model spatial processes in their original scale. This work was motivated by modeling the levels of arsenic groundwater concentration in Comilla, a district of Bangladesh. Some of the observations are left censored. We propose spatial gamma models and explore different parametrizations of the gamma distribution. The gamma model naturally accounts for the skewness present in the data and the fact that arsenic levels are positive. We compare our proposed approaches with two skewed models proposed in the literature. Inference is performed under the Bayesian paradigm and interpolation to unobserved locations of interest naturally accounts for the estimation of the parameters in the proposed model. For the arsenic dataset, one of our proposed gamma models performs best in comparison to previous spatial models for skewed data, in terms of scoring rules criteria. Moreover, under the skewed models, some of the lower limits of the 95% posterior predictive distributions provide negative values violating the assumption that observations are strictly positive. The gamma distribution provides a reasonable, and simpler, alternative to account for the skewness present in the data and provide forecasts that are within the valid values of the observations.
Title: Modeling left-censored skewed spatial processes: The case of arsenic drinking water contamination
Journal: Spatial Statistics
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2211675324000071/pdfft?md5=4e6657431bd93da01f8723d3f2fdc303&pid=1-s2.0-S2211675324000071-main.pdf
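The arsenic abstract above mentions exploring different parametrizations of the gamma distribution. As a minimal sketch, the snippet below converts one common alternative, a (mean, shape) parametrization, into the (shape, scale) form that NumPy expects. The function name and the numeric values are hypothetical and chosen only for illustration. The paper's actual parametrizations, spatial structure and left-censoring treatment are not reproduced here.

```python
import numpy as np


def gamma_mean_shape_to_shape_scale(mean, shape):
    """Convert a (mean, shape) gamma parametrization to (shape, scale).

    With shape a and scale s, the gamma mean is a * s, so s = mean / a
    and the variance is mean**2 / a.
    """
    if mean <= 0 or shape <= 0:
        raise ValueError("mean and shape must be positive")
    return shape, mean / shape


# Hypothetical values, not taken from the arsenic data set.
rng = np.random.default_rng(0)
shape, scale = gamma_mean_shape_to_shape_scale(mean=50.0, shape=2.0)
draws = rng.gamma(shape, scale, size=200_000)

sample_mean = draws.mean()  # close to the requested mean of 50
sample_var = draws.var()    # close to mean**2 / shape = 1250
```

All draws are strictly positive, which reflects the abstract's point that a gamma model cannot produce negative predictions for a positive quantity.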
Spatial stratified heterogeneity, revealing the disparity mechanisms across spatial strata, can be effectively quantified using the geographical detector (GD). GD requires reasonable spatial discretization strategies to investigate the spatial association between the target variable and numerical independent variables. In previous studies, the Robust Geographical Detector (RGD) optimized spatial strata for examining the power of determinants (PD) of individual variables, which demonstrate more robust spatial discretization than other models. However, the GD's interaction detector that explores PD of the interaction of two variables still needs to be enhanced by the robust spatial discretization. This study develops a Robust Interaction Detector (RID), an improved interaction detector, using change detection algorithms for the robust spatial stratified heterogeneity analysis with multiple explanatory variables. RID is applied in a road life expectancy analysis in Western Australia. Results show that RID presents higher PD values than previous GD models, ensuring the growth of PD value with more spatial strata. The RID model indicates that the interactions between various transport variables and elevation are strongly associated with road life expectancy from the perspective of spatial patterns. The developed RID model provides significant potential for enhanced geospatial factor analysis across diverse fields.
Title: Robust interaction detector: A case of road life expectancy analysis
Authors: Zehua Zhang , Yongze Song , Lalinda Karunaratne , Peng Wu
Pub Date : 2024-01-20 DOI: 10.1016/j.spasta.2024.100814
Journal: Spatial Statistics
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2211675324000058/pdfft?md5=f61d206ff82268fb072a2711dc2fed1e&pid=1-s2.0-S2211675324000058-main.pdf
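For context on the PD values discussed in the road life expectancy abstract above, the classic geographical detector measures the power of determinant as q = 1 - (within-strata sum of squares) / (total sum of squares). The sketch below implements that classic statistic and its interaction variant, obtained by overlaying two stratifications. It does not implement the change-detection-based discretization that defines RID. The function names and toy data are hypothetical.

```python
import numpy as np


def q_statistic(y, strata):
    """Classic geographical detector q for one stratification of y."""
    y = np.asarray(y, dtype=float)
    strata = np.asarray(strata)
    total_ss = np.sum((y - y.mean()) ** 2)
    if total_ss == 0:
        raise ValueError("y has zero variance; q is undefined")
    # Sum of squared deviations from each stratum's own mean.
    within_ss = sum(
        np.sum((y[strata == h] - y[strata == h].mean()) ** 2)
        for h in np.unique(strata)
    )
    return 1.0 - within_ss / total_ss


def interaction_q(y, strata_a, strata_b):
    """q for the overlay (cross-classification) of two stratifications."""
    combined = np.array([f"{a}|{b}" for a, b in zip(strata_a, strata_b)])
    return q_statistic(y, combined)


# Toy data: a response with two hypothetical categorical explanatory layers.
y = [1.0, 2.0, 1.5, 8.0, 9.0, 8.5, 4.0, 5.0]
zone = ["a", "a", "a", "b", "b", "b", "c", "c"]
elev = ["lo", "hi", "lo", "hi", "lo", "hi", "lo", "hi"]

q_zone = q_statistic(y, zone)
q_both = interaction_q(y, zone, elev)
```

Because the overlay refines each single stratification, `q_both` can never be smaller than `q_zone`. This matches the abstract's remark that PD grows with more spatial strata, and it is also why a robust discretization is needed to keep comparisons meaningful.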
Pub Date : 2024-01-18 DOI: 10.1016/j.spasta.2024.100812
Yuhan Ma , Kyuhee Shin , GyuWon Lee , Joon Jin Song
In recent decades, spatial classification has received considerable attention in a wide array of disciplines. In practice, a binary response variable is often subject to measurement error, that is, misclassification. To account for the misclassified response in spatial classification, we propose validation data-based adjustment methods that use interval validation data to rectify misclassified responses. Regression calibration and multiple imputation methods are used to correct the misclassified outcomes at locations where the gold-standard device is not available. A generalized linear mixed model and indicator kriging are applied for spatial classification at unsampled locations. Simulation studies are performed to compare the proposed methods with naive methods that ignore the misclassification. The proposed models are found to significantly improve prediction accuracy. Additionally, the proposed models are applied to precipitation detection in South Korea.
Title: Spatial classification in the presence of measurement error
Journal: Spatial Statistics
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2211675324000034/pdfft?md5=0a15ee5f09dc0fe93583f96e7eac46cf&pid=1-s2.0-S2211675324000034-main.pdf
Pub Date : 2024-01-17 DOI: 10.1016/j.spasta.2024.100813
Kellie McClernon, Katherine Goode, Daniel Ries
As global temperatures continue to rise, climate mitigation strategies such as stratospheric aerosol injections (SAI) are increasingly discussed, but the downstream effects of these strategies are not well understood. As such, there is interest in developing statistical methods to quantify the evolution of climate variable relationships during the time period surrounding an SAI. Feature importance applied to echo state network (ESN) models has been proposed as a way to understand the effects of SAI using a data-driven model. This approach depends on the ESN fitting the data well. If it does not, the feature importance may emphasize features that are not representative of the underlying relationships. Typically, time series prediction models such as ESNs are assessed using out-of-sample performance metrics that divide the time series into separate training and testing sets. However, this model assessment approach is geared towards forecasting applications and not scenarios such as the motivating SAI example, where the objective is to use a data-driven model to capture variable relationships. In this paper, we demonstrate a novel use of climate model replicates to investigate the applicability of the commonly used repeated hold-out model assessment approach for the SAI application. Simulations of an SAI are generated using a simplified climate model, and different initialization conditions are used to provide independent training and testing sets containing the same SAI event. The climate model replicates enable out-of-sample measures of model performance, which are compared to the single time series hold-out validation approach. For our case study, it is found that the repeated hold-out sample performance is comparable to, though conservative relative to, the replicate out-of-sample performance when the training set contains enough time after the aerosol injection.
Title: A comparison of model validation approaches for echo state networks using climate model replicates
Journal: Spatial Statistics
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2211675324000046/pdfft?md5=a6ba0350eda3c86948baceceabdf144c&pid=1-s2.0-S2211675324000046-main.pdf
Pub Date : 2024-01-17 DOI: 10.1016/j.spasta.2023.100799
Hiroshi Yamada
This paper considers a filter for smoothing spatial data. It can be used to smooth data on the vertices of arbitrary undirected graphs with arbitrary non-negative spatial weights. It is built on a quantity analogous to Geary's c, which is one of the most prominent measures of spatial autocorrelation. In addition, the quantity can be represented by a matrix called the graph Laplacian in spectral graph theory. We show mathematically that the spatial data become smoother as a parameter, called the smoothing parameter, increases from 0, and are fully smoothed as the parameter goes to infinity, except when the spatial data are already fully smoothed. We also illustrate the results numerically and apply the spatial filter to climatological/meteorological data. In addition, as supplementary investigations, we examine how the sum of squared residuals and the effective degrees of freedom vary with the smoothing parameter. Finally, we review two strands of literature closely related to the spatial filter: the intrinsic conditional autoregressive model and the eigenvector spatial filter. We clarify how the spatial filter considered in this paper relates to them and then mention future research.
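This paper studies a filter for smoothing spatial data. It can smooth data on the vertices of any undirected graph with arbitrary non-negative spatial weights. It involves a quantity analogous to Geary's c, one of the most prominent measures of spatial autocorrelation. This quantity can also be expressed through the matrix known in spectral graph theory as the graph Laplacian. We show mathematically how the spatial data become smoother as a parameter, called the smoothing parameter, increases from 0, and become fully smoothed as the parameter tends to infinity, except when the data are fully smoothed to begin with. We also illustrate the results numerically and apply the spatial filter to climatological/meteorological data. As supplementary investigations, we examine how the sum of squared residuals and the effective degrees of freedom vary with the smoothing parameter. Finally, we review two bodies of literature closely related to the spatial filter: the intrinsic conditional autoregressive model and the eigenvector spatial filter. We clarify how the spatial filter considered in this paper relates to them, and then mention future research.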
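The behaviour described in this abstract can be illustrated with a small numerical sketch. The abstract does not state the filter's formula, so the code assumes the common penalized least-squares form: minimise ||y - x||^2 + lam * x'Lx with L = D - W, which gives x = (I + lam * L)^{-1} y. The penalty x'Lx equals half of sum_ij w_ij (x_i - x_j)^2 for symmetric W, the Geary's-c-like quantity. The paper's exact definition may differ, and the function names and the path-graph data below are hypothetical.

```python
import numpy as np

def graph_laplacian(W):
    """Return L = D - W for a symmetric, non-negative weight matrix W."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W

def laplacian_filter(y, W, lam):
    """Solve (I + lam * L) x = y, the minimiser of ||y - x||^2 + lam * x'Lx (assumed form)."""
    y = np.asarray(y, dtype=float)
    L = graph_laplacian(W)
    return np.linalg.solve(np.eye(len(y)) + lam * L, y)

def roughness(x, W):
    """x'Lx, i.e. half of sum_ij w_ij (x_i - x_j)^2 for symmetric W."""
    x = np.asarray(x, dtype=float)
    return float(x @ graph_laplacian(W) @ x)

# Hypothetical example: a path graph on 5 vertices with unit weights.
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
y = np.array([1.0, 4.0, 2.0, 5.0, 3.0])

# lam = 0 leaves the data unchanged; large lam pulls every vertex
# of this connected graph towards the overall mean of y.
for lam in (0.0, 1.0, 100.0, 1e6):
    x = laplacian_filter(y, W, lam)
    print(f"lam={lam:>9}: x={np.round(x, 3)}, roughness={roughness(x, W):.6f}")
```

Under these assumptions the filter preserves the sum of the data for every value of the smoothing parameter, because each column of the Laplacian sums to zero; on a graph with several connected components the limit would be the mean within each component rather than one global mean.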
{"title":"Spatial Smoothing Using Graph Laplacian Penalized Filter","authors":"Hiroshi Yamada","doi":"10.1016/j.spasta.2023.100799","DOIUrl":"10.1016/j.spasta.2023.100799","url":null,"abstract":"<div><p>This paper considers a filter for smoothing spatial data. It can be used to smooth data on the vertices of arbitrary undirected graphs with arbitrary non-negative spatial weights. It consists of a quantity analogous to Geary’s <span><math><mi>c</mi></math></span>, which is one of the most prominent measures of spatial autocorrelation. In addition, the quantity can be represented by a matrix called the graph Laplacian in spectral graph theory. We show mathematically how spatial data becomes smoother as a parameter, called the smoothing parameter, increases from 0 and is fully smoothed as the parameter goes to infinity, except for the case where the spatial data is originally fully smoothed. We also illustrate the results numerically and apply the spatial filter to climatological/meteorological data. In addition, as supplementary investigations, we examine how the sum of squared residuals and the effective degrees of freedom vary with the smoothing parameter. Finally, we review two closely related literatures to the spatial filter. One is the intrinsic conditional autoregressive model and the other is the eigenvector spatial filter. We clarify how the spatial filter considered in this paper relates to them. 
We then mention future research.</p></div>","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139496155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}