Pub Date: 2025-12-01; Epub Date: 2025-09-25; DOI: 10.1016/j.spasta.2025.100935
Kabelo Mahloromela, Inger Fabris-Rotelli
A point pattern is typically analysed to understand the first- and second-order properties of the underlying point process. These properties are usually inferred using estimation procedures that depend on interpoint distance and are thus sensitive to the choice of distance metric. Euclidean distance is conventionally used to quantify proximity between points, but it does not accurately reflect spatial relationships when points are constrained within irregular, nonconvex spatial domains. Herein, we propose a strategy to embed visibility graph distances into Euclidean metric space using multidimensional scaling. The aim is to simplify analyses, leverage well-developed methods based on Euclidean distance, and retain, as far as possible, the true proximity relationships on a nonconvex spatial domain. The kernel smoothed intensity estimate and the K-function are computed in this new spatial context and used to validate the effectiveness of the embedding strategy.
Title: A framework for analysing point patterns on nonconvex domains using visibility graphs and multidimensional scaling. Spatial Statistics, volume 70, Article 100935.
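The embedding step described in this abstract can be illustrated with classical (Torgerson) multidimensional scaling. The sketch below is not the authors' implementation: the function names and the three-point toy distance matrix are hypothetical, and classical MDS is only one of several MDS variants the paper might use.

```python
import numpy as np

def classical_mds(dist, dim=2):
    """Embed a symmetric distance matrix into `dim` Euclidean coordinates."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    # Double-centre the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the `dim` largest eigenvalues; clip negatives, which arise when
    # the input distances are not exactly Euclidean.
    order = np.argsort(eigvals)[::-1][:dim]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)

def pairwise_euclidean(coords):
    """Euclidean distance matrix of the rows of `coords`."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical toy distances: A and C cannot see each other, so the
# shortest path between them runs through B (1 + 1 = 2).
path_dist = np.array([[0.0, 1.0, 2.0],
                      [1.0, 0.0, 1.0],
                      [2.0, 1.0, 0.0]])
embedded = classical_mds(path_dist, dim=2)
embedded_dist = pairwise_euclidean(embedded)
```

In this toy case the path distances happen to be exactly realisable in Euclidean space, so `embedded_dist` reproduces `path_dist`. Shortest-path distances on a real nonconvex domain are generally only approximated by the embedding.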
Pub Date: 2025-12-01; Epub Date: 2025-09-25; DOI: 10.1016/j.spasta.2025.100928
Yangha Chung, Ji Meng Loh, Woncheol Jang
We introduce a doubly-smoothed bandwidth selection method to obtain bandwidth matrices H for estimating the intensity function of a spatial point process. The doubly-smoothed bootstrap involves taking bootstrap samples by adding random noise and using Dirichlet rather than multinomial weights. The mean integrated squared error (MISE) and asymptotic mean integrated squared error (AMISE) as a function of H can then be computed numerically using the bootstrap samples, with optimal H obtained by minimizing the MISE or AMISE with respect to H. We present simulation results comparing the doubly-smoothed bandwidth selection method with other methods for a number of intensity functions. We also apply our methods to a data set of police pedestrian stops in New York City.
Title: Bandwidth selection for the intensity in spatial point processes. Spatial Statistics, volume 70, Article 100928.
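The two ingredients named in this abstract, added random noise and Dirichlet weights, can be sketched as a single bootstrap replicate. Everything beyond those two ingredients is an assumption made for illustration: Gaussian noise with a pilot covariance, a flat Dirichlet distribution, a Gaussian kernel with `bandwidth_cov` playing the role of H, and no edge correction. The function names are hypothetical, and the MISE/AMISE minimisation is not shown.

```python
import numpy as np

def doubly_smoothed_replicate(points, pilot_cov, rng):
    """One replicate: jitter every point and draw Dirichlet weights.

    Assumptions: Gaussian noise with covariance `pilot_cov` and a flat
    Dirichlet(1, ..., 1) in place of multinomial resampling counts.
    """
    n, d = points.shape
    noise = rng.multivariate_normal(np.zeros(d), pilot_cov, size=n)
    weights = rng.dirichlet(np.ones(n))
    return points + noise, weights

def weighted_kernel_intensity(eval_at, points, weights, bandwidth_cov):
    """Weighted Gaussian-kernel intensity estimate (no edge correction)."""
    n, d = points.shape
    inv = np.linalg.inv(bandwidth_cov)
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(bandwidth_cov))
    diff = eval_at[:, None, :] - points[None, :, :]
    # Quadratic form (u - x_i)^T H^{-1} (u - x_i) for every pair (u, x_i).
    quad = np.einsum("mnd,de,mne->mn", diff, inv, diff)
    kernels = norm * np.exp(-0.5 * quad)
    # The weights sum to one, so scale by n to stay on the intensity scale.
    return n * kernels @ weights

rng = np.random.default_rng(0)
points = rng.uniform(0.0, 1.0, size=(50, 2))
boot_points, boot_weights = doubly_smoothed_replicate(points, 0.01 * np.eye(2), rng)
grid = np.array([[0.25, 0.25], [0.5, 0.5], [0.75, 0.75]])
intensity = weighted_kernel_intensity(grid, boot_points, boot_weights, 0.02 * np.eye(2))
```

Repeating the replicate step many times for each candidate bandwidth matrix would give the bootstrap samples over which an error criterion could be averaged numerically.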
Pub Date: 2025-12-01; Epub Date: 2025-09-22; DOI: 10.1016/j.spasta.2025.100932
Jinsheng Xie
This study aims to establish uncertain spatial statistics by first exploring the uncertain spatial autoregressive model. Modeling the observations of the response variable via uncertain variables and assuming they are affected by neighboring observations, this paper develops an uncertain spatial autoregressive model to estimate relationships among uncertain variables with spatial locations. By employing the principle of least squares, a minimization problem is formulated to estimate the unknown parameters of the model. Finally, two real-world examples, from regional economic analysis and regional air quality analysis, demonstrate the uncertain spatial autoregressive model.
Title: Uncertain spatial autoregressive model with applications to regional economic analysis and regional air quality analysis. Spatial Statistics, volume 70, Article 100932.
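The least-squares idea in this abstract can be illustrated on an ordinary spatial autoregressive form, y = rho * W y + X beta + error, with crisp (non-uncertain) observations. This is a deliberate simplification, not the paper's method: the uncertain-variable treatment is not reproduced, and the function name, ring-shaped weight matrix and data below are hypothetical.

```python
import numpy as np

def fit_sar_least_squares(y, X, W):
    """Minimise ||y - rho * W y - X beta||^2 over (rho, beta)."""
    # Treat the spatial lag W y as one more regressor next to X.
    design = np.column_stack([W @ y, X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]

# Hypothetical ring of six regions; each has two neighbours, row-normalised.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

x = np.array([0.3, 1.7, -0.5, 2.2, 0.9, -1.1])
X = np.column_stack([np.ones(n), x])
true_rho, true_beta = 0.4, np.array([1.0, 2.0])

# Noise-free response solving y = rho * W y + X beta.
y = np.linalg.solve(np.eye(n) - true_rho * W, X @ true_beta)
rho_hat, beta_hat = fit_sar_least_squares(y, X, W)
```

Because the toy response is generated without noise, the fit returns the generating values of `rho` and `beta` up to rounding. With noisy data the same minimisation still runs but would not recover them exactly.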
Small area estimation (SAE) addresses the estimation of parameters for population subsets when the sample itself is too small to produce reliable direct estimates. The standard method, empirical best linear unbiased prediction, uses a predictor under a linear mixed model that assumes normality of the variable of interest and independence among small areas. However, in practical studies, the distribution of the variable of interest tends to be positively skewed and there exists spatial dependence among the small areas. To address both of these, a previous study proposed a spatial synthetic (SYNT) predictor that predicts non-sampled values of the variable of interest using its unconditional mean. The SYNT predictor is derived based on a unit-level spatial lognormal mixed model. Herein, we propose a spatial empirical best predictor (SEBP) to improve the SYNT predictor by using its conditional mean to predict the non-sampled values of the variable of interest. We perform simulation studies to evaluate the performance of the SEBP and compare it with that of the SYNT predictor and other existing methods. Our results reveal that the SEBP performs better in terms of the average relative bias and average relative root mean square error when the spatial correlation among small areas is small, medium or large. In an SAE application on the average monthly household per-capita expenditure for sub-districts in Bogor, Indonesia, the proposed SEBP provides better estimates than other established methods.
Title: Spatial empirical best predictor of small area linear parameter for positively skewed outcomes. Authors: Dian Handayani, Khairil Anwar Notodiputro, Asep Saefuddin, I Wayan Mangku, Anang Kurnia. Pub Date: 2025-12-01; DOI: 10.1016/j.spasta.2025.100941. Spatial Statistics, volume 70, Article 100941.
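The contrast between predicting with an unconditional mean and with a conditional mean can be seen in a two-value lognormal toy case. The sketch below uses only the standard lognormal mean and bivariate normal conditioning formulas. It is not the SYNT or SEBP predictor itself, and the function names and numbers are hypothetical.

```python
import math

def lognormal_unconditional_mean(mu, var):
    """E[Y] when log Y ~ N(mu, var)."""
    return math.exp(mu + var / 2.0)

def lognormal_conditional_mean(mu_r, mu_s, var_r, var_s, cov_rs, log_y_s):
    """E[Y_r | Y_s = y_s] when (log Y_r, log Y_s) is bivariate normal."""
    # Standard conditioning of a bivariate normal on the observed log-value.
    cond_mu = mu_r + cov_rs / var_s * (log_y_s - mu_s)
    cond_var = var_r - cov_rs ** 2 / var_s
    return math.exp(cond_mu + cond_var / 2.0)

# A non-sampled value correlated with one sampled value (toy numbers).
unconditional = lognormal_unconditional_mean(0.0, 1.0)
conditional = lognormal_conditional_mean(0.0, 0.0, 1.0, 1.0, 0.5, 1.0)
# With zero covariance the sampled value carries no information.
uncorrelated = lognormal_conditional_mean(0.0, 0.0, 1.0, 1.0, 0.0, 1.0)
```

When the covariance is zero the two predictions coincide. The stronger the dependence between the sampled and non-sampled values, the more the conditional mean can differ from the unconditional one.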
Pub Date: 2025-12-01; Epub Date: 2025-09-23; DOI: 10.1016/j.spasta.2025.100934
Giovanni Millo
We address the issue of specifying a spatial lag vs. spatial error process in spatial panel models. The popular locally robust Lagrange multiplier (RLM) tests for spatial lag vs. error are compared to optimal alternatives based on maximum likelihood estimation: Wald and likelihood ratio (LR) tests requiring estimation of the full encompassing model, and conditional Lagrange multiplier (CLM) tests drawing on the reduced specification. Monte Carlo simulations are performed in a typical spatial panel context. Individual effects are successfully eliminated through the forward orthogonal deviations transformation, making the RLM suitable for panel data. Nevertheless, the statistical properties of Wald and LR are superior to those of the RLM. The CLM also dominates the RLM, as long as the sample is at least of moderate size. The RLM are computationally very convenient, but ML-based tests are feasible in most usage cases on mainstream hardware.
Title: Specifying spatial effects in panel data: Locally robust vs. conditional tests. Spatial Statistics, volume 70, Article 100934.
Pub Date: 2025-12-01; Epub Date: 2025-10-01; DOI: 10.1016/j.spasta.2025.100938
Denis Allard
This paper illustrates how progress in spatial statistics is fueled by scientific questions arising from applications in agriculture and environment. The unifying theme is the work that has been carried out at BioSP, a statistics and mathematics research unit mainly affiliated to the “Mathematics and Digital Technologies” division at INRAE, the French National Research Institute for Agriculture, Food and Environment. Starting from the 20 contributions that BioSP members have published in Spatial Statistics since its creation in 2012, almost fifteen years of advances are reviewed, spanning point processes, (multivariate) spatio-temporal Gaussian processes, compositional data, stochastic weather generators and extreme value theory. Most of the content is focused on theoretical and methodological developments, with examples being limited due to length constraints for the article. Attention is given to how these advances have been inspired by problems arising in other research domains. In return, it will be shown how they have opened new research questions in spatial statistics and how they have had an impact on the scientific fields they originated from. In conclusion, some perspectives and outlooks are discussed, in particular in relation to the AI revolution.
Title: Growth of spatial statistics for agriculture and environment: The example of BioSP at INRAE. Spatial Statistics, volume 70, Article 100938.
Pub Date: 2025-12-01; Epub Date: 2025-09-11; DOI: 10.1016/j.spasta.2025.100930
Daniela Silva, Raquel Menezes, Gonçalo Araújo, Renato Rosa, Ana Moreno, Alexandra Silva, Susana Garrido
Accurately identifying spatial patterns of species distribution is crucial for scientific insight and societal benefit, aiding our understanding of species fluctuations. The increasing quantity and quality of ecological datasets present heightened statistical challenges, complicating the comprehension of spatial species dynamics. Addressing the complex task of integrating multiple data sources to enhance understanding of spatial fish distribution in marine ecology, this study introduces a pioneering five-layer Joint model. The model adeptly integrates fishery-independent and fishery-dependent data, accommodating zero-inflated data and distinct sampling processes. A comprehensive simulation study evaluates the model performance across various preferential sampling scenarios and sample sizes, elucidating its advantages and challenges. Our findings highlight the model’s robustness in estimating preferential parameters, emphasizing differentiation between presence–absence and biomass observations. Evaluation of estimation of spatial covariance and prediction performance underscores the model’s reliability. Augmenting sample sizes reduces parameter estimation variability, aligning with the principle that increased information enhances certainty. Assessing the contribution of each data source reveals successful integration, providing a comprehensive representation of biomass patterns. Empirical application within a real-world context further solidifies the model’s efficacy in capturing species’ spatial distribution. This research advances methodologies for integrating diverse datasets with different sampling natures, further contributing to a more informed understanding of spatial dynamics of marine species.
Title: Joint model for zero-inflated data combining fishery-dependent and fishery-independent sources. Spatial Statistics, volume 70, Article 100930.
Pub Date: 2025-12-01; Epub Date: 2025-09-24; DOI: 10.1016/j.spasta.2025.100933
Maria E. Kamenetsky, Jun Zhu, Ronald E. Gangnon
Patterns in disease across space and time are important to epidemiologists and health professionals because they may indicate underlying elevated disease risk. In some cases, elevated risk may be driven by environmental exposures, infectious diseases or other factors where timely public health interventions are important. The spatial and spatio-temporal scan statistics identify a single most likely cluster or equivalently select a single correct model. We instead consider an ensemble of single cluster models. We use stacking, a model-averaging technique, to combine relative risk estimates from all of the single cluster models into a sequence of meta-models indexed by the effective number of parameters/clusters. The number of parameters/spatio-temporal clusters is chosen using information criteria. A simulation study is conducted to demonstrate the statistical properties of the stacking method. The method is illustrated using a dataset of female breast cancer incidence data at the municipality level in Japan.
Title: Spatial and spatio-temporal cluster detection using stacking. Spatial Statistics, volume 70, Article 100933.
Pub Date: 2025-12-01; Epub Date: 2025-09-10; DOI: 10.1016/j.spasta.2025.100929
Zhenhua Wang, Paul A. Parker, Scott H. Holan
Small area estimation models are essential for estimating population characteristics in regions with limited sample sizes, thereby supporting policy decisions, demographic studies, and resource allocation, among other use cases. The spatial Fay–Herriot model is one such approach that incorporates spatial dependence to improve estimation by borrowing strength from neighboring regions. However, this approach often requires substantial computational resources, limiting its scalability for high-dimensional datasets, especially when considering multiple (multivariate) responses. This paper proposes two methods that integrate the multivariate spatial Fay–Herriot model with spatial random effects, learned through variational autoencoders, to efficiently leverage spatial structure. Importantly, after training the variational autoencoder to represent spatial dependence for a given set of geographies, it may be used again in future modeling efforts, without the need for retraining. Additionally, the use of the variational autoencoder to represent spatial dependence results in extreme improvements in computational efficiency, even for massive datasets. We demonstrate the effectiveness of our approach using 5-year period estimates from the American Community Survey over all census tracts in California.
Title: Variational autoencoded multivariate spatial Fay–Herriot models. Spatial Statistics, volume 70, Article 100929.
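As background to the Fay–Herriot model named in this abstract, the basic univariate, non-spatial version combines each area's direct estimate with a regression-based (synthetic) prediction, weighting the direct estimate by gamma = model variance / (model variance + sampling variance). The sketch below shows only that shrinkage step, with the model variance and synthetic predictions assumed known. The spatial, multivariate and variational-autoencoder components of the paper are not shown, and the function name and numbers are hypothetical.

```python
def fay_herriot_shrinkage(direct, synthetic, sampling_var, model_var):
    """Shrink direct estimates toward synthetic predictions, area by area."""
    estimates = []
    for y, s, d in zip(direct, synthetic, sampling_var):
        # A noisier direct estimate (larger d) gets a smaller weight gamma.
        gamma = model_var / (model_var + d)
        estimates.append(gamma * y + (1.0 - gamma) * s)
    return estimates

# Two areas with the same synthetic prediction but different precision.
shrunk = fay_herriot_shrinkage(
    direct=[10.0, 20.0],
    synthetic=[12.0, 12.0],
    sampling_var=[1.0, 9.0],
    model_var=3.0,
)
```

The first area has a precise direct estimate and keeps most of it (gamma = 0.75). The second is noisy and is pulled strongly toward the synthetic value (gamma = 0.25).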
Pub Date: 2025-12-01; Epub Date: 2025-09-23; DOI: 10.1016/j.spasta.2025.100931
Aibing Ji, Jingxuan Li, Qingqing Li
Spatial autoregressive panel data models are widely employed in regional economics to capture spatial dependencies, but conventional specifications rely on a single spatial weight matrix, heightening the risk of model misspecification. Current research lacks systematic model averaging methods for integrating multiple weight matrices and addressing spatial effect uncertainty. This study proposes a novel model averaging framework for spatial autoregressive panel data models with fixed effects, extending model averaging methodology to the spatial panel context and enabling flexible combinations of multiple weight matrices for both dependent variables and error terms. An adaptive Mallows-type criterion is developed, dynamically adjusting to the presence or absence of spatial effects, with its asymptotic optimality established. Monte Carlo simulations confirm robustness across scenarios with no, single, or mixed spatial dependencies. An empirical application to Chinese provincial housing prices identifies economic adjacency as the key spatial dependence driver, validating the method’s predictive accuracy and policy utility for spatiotemporal data analysis.
Title: Model averaging for spatial autoregressive panel data models. Spatial Statistics, volume 70, Article 100931.