Pub Date: 2025-09-22; DOI: 10.1016/j.spasta.2025.100932
Jinsheng Xie
This study takes a first step toward establishing uncertain spatial statistics by exploring the uncertain spatial autoregressive model. Modeling the observations of the response variable as uncertain variables and assuming they are affected by neighboring observations, this paper develops an uncertain spatial autoregressive model to estimate relationships among uncertain variables with spatial locations. By employing the principle of least squares, a minimization problem is formulated to estimate the unknown parameters in the uncertain spatial autoregressive model. Finally, two real-world examples, one on regional economic analysis and one on regional air quality analysis, are given to demonstrate the uncertain spatial autoregressive model.
Title: Uncertain spatial autoregressive model with applications to regional economic analysis and regional air quality analysis. Journal: Spatial Statistics, vol. 70, Article 100932.
Pub Date: 2025-09-11; DOI: 10.1016/j.spasta.2025.100930
Daniela Silva, Raquel Menezes, Gonçalo Araújo, Renato Rosa, Ana Moreno, Alexandra Silva, Susana Garrido
Accurately identifying spatial patterns of species distribution is crucial for scientific insight and societal benefit, aiding our understanding of species fluctuations. The increasing quantity and quality of ecological datasets present heightened statistical challenges, complicating the understanding of spatial species dynamics. To address the complex task of integrating multiple data sources to better understand the spatial distribution of fish in marine ecology, this study introduces a pioneering five-layer Joint model. The model integrates fishery-independent and fishery-dependent data, accommodating zero-inflated data and distinct sampling processes. A comprehensive simulation study evaluates the model performance across various preferential sampling scenarios and sample sizes, elucidating its advantages and challenges. Our findings highlight the model’s robustness in estimating preferential parameters, emphasizing differentiation between presence–absence and biomass observations. Evaluation of the estimated spatial covariance and of prediction performance underscores the model’s reliability. Augmenting sample sizes reduces parameter estimation variability, aligning with the principle that increased information enhances certainty. Assessing the contribution of each data source reveals successful integration, providing a comprehensive representation of biomass patterns. Empirical application within a real-world context further solidifies the model’s efficacy in capturing species’ spatial distribution. This research advances methodologies for integrating diverse datasets with different sampling natures, contributing to a more informed understanding of the spatial dynamics of marine species.
Title: Joint model for zero-inflated data combining fishery-dependent and fishery-independent sources. Journal: Spatial Statistics, vol. 70, Article 100930.
Pub Date: 2025-09-10; DOI: 10.1016/j.spasta.2025.100929
Zhenhua Wang, Paul A. Parker, Scott H. Holan
Small area estimation models are essential for estimating population characteristics in regions with limited sample sizes, thereby supporting policy decisions, demographic studies, and resource allocation, among other use cases. The spatial Fay–Herriot model is one such approach that incorporates spatial dependence to improve estimation by borrowing strength from neighboring regions. However, this approach often requires substantial computational resources, limiting its scalability for high-dimensional datasets, especially when considering multiple (multivariate) responses. This paper proposes two methods that integrate the multivariate spatial Fay–Herriot model with spatial random effects, learned through variational autoencoders, to efficiently leverage spatial structure. Importantly, after training the variational autoencoder to represent spatial dependence for a given set of geographies, it may be used again in future modeling efforts, without the need for retraining. Additionally, the use of the variational autoencoder to represent spatial dependence results in extreme improvements in computational efficiency, even for massive datasets. We demonstrate the effectiveness of our approach using 5-year period estimates from the American Community Survey over all census tracts in California.
Title: Variational autoencoded multivariate spatial Fay–Herriot models. Journal: Spatial Statistics, vol. 70, Article 100929.
Pub Date: 2025-08-26; DOI: 10.1016/j.spasta.2025.100927
Giampiero M. Gallo, Demetrio Lacava, Edoardo Otranto
This paper introduces a novel two-stage modeling framework that combines Markov Switching (MS) models with an autoregressive model augmented by spatial effects to analyze the dynamics and spatial interdependence of Biden’s polling percentages during the 2020 electoral campaign. In the first stage, we employ MS models to segment each state’s daily polling time series into distinct regimes — interpreted as phases of decline, stability, and growth. This segmentation captures abrupt changes and local trends in public opinion, enabling us to link regime shifts with key political events such as debates, party conventions, and milestone campaign achievements. The inherent nonlinearity of polling data would otherwise be lost by first differencing. By removing the regime-specific components, we generate stationary residuals modeled using an Autoregressive model with exogenous variables (ARX) that incorporates political spatial interactions through two complementary effects. The spillover effect captures lagged influences arising from politically influential states, while the contagion effect reflects the contemporaneous impact of neighboring states. A recursive algorithm based on partial correlations is implemented to select the most relevant spillover sources for each state. Empirical results, based on daily data from 13 swing states, reveal robust evidence of persistent regime structures and marked spatial dependencies. While contagion effects are uniformly significant across states, spillover dynamics exhibit considerable heterogeneity in both magnitude and direction. This integrated modeling approach enhances our understanding of the complex, nonlinear temporal evolution of polling trends and the spatial diffusion of political opinions that underpinned the 2020 electoral outcome.
Title: Regime changes and spatial dependence in the 2020 US presidential election polls. Journal: Spatial Statistics, vol. 69, Article 100927.
Pub Date: 2025-08-22; DOI: 10.1016/j.spasta.2025.100926
Wentao Wang, Dengkui Li
This paper considers spatial autoregressive kink models with an unknown threshold, where the impact of a specific explanatory variable on the response variable is piecewise linear but differs below and above this threshold. To address the endogeneity issue, the paper presents a modified generalized method of moments (GMM) that consistently estimates the threshold location and slope changes. Asymptotic properties, including the consistency and asymptotic normality of the GMM estimators and the limiting distribution of the Sup-Wald statistic, are established under a set of regularity assumptions. In view of the nonstandard asymptotic null distribution, we use a multiplier bootstrap to approximate the p-value of the Sup-Wald statistic to detect the presence of the threshold. A simulation study illustrates that the estimators and inference are well-behaved in finite samples. An empirical application to the secondary industrial structure data of 280 Chinese prefecture-level cities further highlights the practical merits of our methods.
Title: GMM inference for the spatial autoregressive kink model with an unknown threshold. Journal: Spatial Statistics, vol. 69, Article 100926.
Pub Date: 2025-08-05; DOI: 10.1016/j.spasta.2025.100924
Christophe A.N. Biscio, Adrien Mazoyer, Martin V. Vejling
Monte Carlo tests are widely used for computing valid p-values without requiring known distributions of test statistics. When performing multiple Monte Carlo tests, it is essential to maintain control of the type I error. Some techniques for multiplicity control pose requirements on the joint distribution of the p-values, for instance independence, which can be computationally intensive to achieve, as it requires simulating disjoint null samples for each test. We refer to this as naïve multiple Monte Carlo testing. We highlight in this work that multiple Monte Carlo testing is an instance of conformal novelty detection. Leveraging this insight enables a more efficient multiple Monte Carlo testing procedure, avoiding excessive simulations by using a single null sample for all the tests, while still ensuring exact control over the false discovery rate or the family-wise error rate. We call this approach conformal multiple Monte Carlo testing. The performance is investigated in the context of global envelope tests for point pattern data through a simulation study and an application to a sweat gland data set. Results reveal that with a fixed simulation budget, our proposed method yields substantial improvements in power of the testing procedure as compared to the naïve multiple Monte Carlo testing procedure.
Title: Conformal novelty detection for replicate point patterns with FDR or FWER control. Journal: Spatial Statistics, vol. 69, Article 100924.
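The idea of reusing a single null sample for every test can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, scores are assumed to be scalar with larger values counting as more extreme, and the Benjamini-Hochberg step-up rule is used here as one common choice for false discovery rate control.

```python
from typing import List, Sequence


def conformal_p_values(null_scores: Sequence[float],
                       test_scores: Sequence[float]) -> List[float]:
    """Monte Carlo p-values for all tests computed from ONE shared null sample.

    Assumption of this sketch: larger scores are more extreme.
    """
    n = len(null_scores)
    return [
        # (1 + number of null scores at least as extreme) / (n + 1)
        (1 + sum(1 for s in null_scores if s >= t)) / (n + 1)
        for t in test_scores
    ]


def benjamini_hochberg(p_values: Sequence[float],
                       alpha: float = 0.05) -> List[bool]:
    """Return one rejection flag per hypothesis (Benjamini-Hochberg step-up)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= alpha * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    shared_null = list(range(1, 20))   # 19 simulated null scores (toy values)
    observed = [50, 0, 10]             # scores of three observed patterns
    p = conformal_p_values(shared_null, observed)
    print(p, benjamini_hochberg(p, alpha=0.2))
```

The naïve alternative would simulate a separate null sample per test, so with a fixed simulation budget each p-value would rest on far fewer null scores.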
Pub Date: 2025-07-29; DOI: 10.1016/j.spasta.2025.100925
Gang Xu, Qirui Zhang, Xinlei Xu, Yajie Zhang, Yansheng Li
Understanding the fine-scale spatial dynamics of infectious disease outbreaks is essential for effective urban epidemic response. This study leverages a novel dataset of over 2700 community-level epidemic notifications, shared publicly in residential areas and through social media during the early COVID-19 outbreak in Wuhan, China, to map the intra-urban spread of the virus from February 2 to March 4, 2020. After manually structuring and geocoding these notifications, we constructed a high-resolution spatiotemporal dataset of 13,346 confirmed cases across 1532 neighborhoods. Using spatial statistical techniques, we identified the evolution of spatial clustering, directional shifts in epidemic centers, and seven statistically significant spatio-temporal clusters with relative risks ranging from 1.21 to 12.48. Our results reveal the critical role of urban morphology, population density, and built environment characteristics in shaping transmission dynamics. Notably, Qingshan District emerged as a persistent hotspot due to its open neighborhood design and delayed compliance with containment measures. This research underscores the value of Volunteered Geographic Information (VGI) for early, fine-scale epidemic monitoring and demonstrates its utility as a complement to official surveillance systems in public emergencies.
Title: Spatiotemporal dynamics of COVID-19 in Wuhan based on community notifications. Journal: Spatial Statistics, vol. 69, Article 100925.
Pub Date: 2025-07-28; DOI: 10.1016/j.spasta.2025.100923
Won Chang, Youngdeok Hwang, Hang J. Kim
Satellite images using multiple wavelength channels provide crucial measurements over large areas, aiding the understanding of pollution generation and transport. However, these images often contain missing data due to cloud cover and algorithm limitations. In this paper, we introduce a novel method for interpolating missing values in satellite images by incorporating pollution transport dynamics influenced by wind patterns. Our approach utilizes a fundamental physics equation to structure the covariance of missing data, improving accuracy by considering pollution transport dynamics. To address computational challenges associated with large datasets, we implement a gradient ascent algorithm. We demonstrate the effectiveness of our method through a case study, showcasing its potential for accurate interpolation in high-resolution, spatio-temporal air pollution datasets.
Title: Physics-driven dynamic interpolation with application to pollution satellite images. Journal: Spatial Statistics, vol. 69, Article 100923.
Pub Date: 2025-07-26; DOI: 10.1016/j.spasta.2025.100921
Jonathan Acosta, Ronny Vallejos, Pilar García-Soidán
The variogram function plays a key role in modeling intrinsically stationary random fields, especially in spatial prediction using kriging equations. However, determining whether a computed variogram accurately fits the underlying dependence structure can be challenging. Current nonparametric estimators often fail to guarantee a conditionally negative definite function. In this paper, we propose a new valid variogram estimator, constructed as a linear combination of functions from a predefined class, ensuring it meets essential mathematical properties. A penalty coefficient is introduced to prevent overfitting, reducing spurious fluctuations in the estimated variogram. We also extend the concept of effective sample size (ESS), an important metric in spatial regression, to a nonparametric framework. Our ESS estimator is based on the reciprocal of the average correlation and is calculated using a plug-in approach, with the consistency of the estimator being demonstrated. The performance of these estimates is investigated through Monte Carlo simulations across various scenarios. Finally, we apply the methodology to rasterized forest images, illustrating both the strengths and limitations of the proposed approach.
Title: A penalized estimation of the variogram and effective sample size. Journal: Spatial Statistics, vol. 69, Article 100921.
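One way to read "the reciprocal of the average correlation" is ESS = n² / Σᵢⱼ Rᵢⱼ, where R is the n × n correlation matrix of the observations. The sketch below illustrates that reading only. The abstract does not give the exact formula, so this definition is an assumption, and the exponential correlation function is a hypothetical stand-in for the nonparametric plug-in estimate the paper derives from its variogram estimator.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def exponential_correlation(coords: Sequence[Tuple[float, float]],
                            range_param: float) -> Matrix:
    """Correlation matrix exp(-distance / range_param) between 2-D sites.

    Stand-in for a plug-in correlation estimate (assumption of this sketch).
    """
    return [
        [math.exp(-math.dist(a, b) / range_param) for b in coords]
        for a in coords
    ]


def effective_sample_size(corr: Matrix) -> float:
    """ESS as the reciprocal of the average entry of the correlation matrix."""
    n = len(corr)
    total = sum(sum(row) for row in corr)
    return n * n / total


if __name__ == "__main__":
    # A 3 x 3 grid of sites with unit spacing (toy data).
    sites = [(float(i), float(j)) for i in range(3) for j in range(3)]
    for rng in (0.1, 1.0, 10.0):
        ess = effective_sample_size(exponential_correlation(sites, rng))
        print(f"range={rng}: ESS={ess:.2f} of n={len(sites)}")
```

Under this definition, uncorrelated observations give ESS = n and perfectly correlated observations give ESS = 1, so stronger spatial dependence lowers the effective sample size.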
Pub Date: 2025-07-24; DOI: 10.1016/j.spasta.2025.100922
Lingling Tian, Chuanhua Wei, Wenxing Ding, Mixia Wu
This paper investigates a spatial autoregressive (SAR) panel data model featuring fixed effects and time-varying coefficients in both the covariates and spatial dependence. We propose a two-stage least squares estimation based on local linear dummy variables (2SLS-LLDV). This method effectively captures individual heterogeneity via dummy variable construction while maintaining computational tractability. Under mild regularity conditions, we establish the asymptotic normality of the proposed estimators. Furthermore, we devise a residual-based bootstrap procedure to test the temporal stability of the time-varying spatial dependence parameter, providing a robust mechanism for p-value calculation in finite-sample scenarios. Monte Carlo simulations are conducted to evaluate the finite-sample performance of our proposed methods. Finally, we employ our proposed estimation and testing methods to analyze carbon emissions in China and cigarette demand in the United States, demonstrating their practical applicability.
Title: Estimation and testing of time-varying coefficients spatial autoregressive panel data model. Journal: Spatial Statistics, vol. 69, Article 100922.