Pub Date: 2026-04-01; Epub Date: 2026-01-17; DOI: 10.1016/j.spasta.2026.100955
Yongqi Wang, Yunquan Song
Transfer learning is a machine learning approach that enhances target domain performance by leveraging knowledge from source domains. Although this method has been widely applied in regression problems, research remains limited for scenarios involving partially missing response data in the target domain. This study addresses the dual challenges of missing responses and small sample sizes in spatially dependent regression problems by proposing an EM algorithm-based transfer learning framework. The framework first employs the EM algorithm to handle missing responses in spatial autoregressive models, then develops a two-step transfer learning method for known source domains, along with a cross-validation-based detection algorithm for unknown transferable sources. Numerical simulations demonstrate that the proposed methods exhibit superior performance in both parameter estimation accuracy and model robustness.
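For orientation, the textbook Gaussian setting in which an EM step for missing responses in a spatial autoregressive (SAR) model is usually written can be sketched as follows. This is a standard formulation, not necessarily the authors' notation or exact algorithm; the symbols \rho, W, A(\rho) and the subscripts o (observed) and m (missing) are assumptions introduced here for illustration.

```latex
% Standard Gaussian SAR model (assumed notation, not taken from the paper)
\begin{align}
  y &= \rho W y + X\beta + \varepsilon,
      \qquad \varepsilon \sim N(0, \sigma^2 I_n), \\
% Reduced form, with A(\rho) = I_n - \rho W assumed invertible
  y &\sim N(\mu, \Sigma),
      \qquad \mu = A(\rho)^{-1} X\beta,
      \qquad \Sigma = \sigma^2 \bigl(A(\rho)^{\top} A(\rho)\bigr)^{-1}, \\
% E-step ingredient: conditional mean of the missing block given the observed block
  \mathbb{E}[\, y_m \mid y_o \,] &= \mu_m + \Sigma_{mo}\,\Sigma_{oo}^{-1}\,(y_o - \mu_o).
\end{align}
```

Under these assumptions, an E-step would replace the missing responses by their conditional expectations (together with the matching conditional covariance), and an M-step would update \beta, \sigma^2 and \rho as if the responses were complete.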
Title: Transfer learning for spatial autoregressive models with missing responses. Journal: Spatial Statistics, volume 72, Article 100955.
Pub Date: 2026-03-01; Epub Date: 2025-12-10; DOI: 10.1016/j.spasta.2025.100948
Paritosh Kumar Roy, Alexandra M. Schmidt
Environmental data commonly involve measuring multiple pollutants, such as NO₂ and PM₁₀ levels, at some fixed sites across a region. Data analysts aim to describe the processes accounting for covariance across space and among pollutants, usually assuming a multivariate spatial Gaussian model with a stationary covariance function. However, the observed data distribution often exhibits heterogeneous variability, resulting in heavier tails than the Gaussian distribution. To address these challenges without transforming the data, we propose a flexible multivariate spatial model with spatially varying covariate-dependent variance that naturally accommodates heavy-tailed distributions. Specifically, we extend the linear model of coregionalization by modeling the variances of the processes, allowing them to vary across space and depend on covariates. We discuss the properties of the proposed model and outline a Bayesian inference procedure implemented using the software Stan. As the model involves several Gaussian process components, we further discuss Vecchia-based approximation methods for analyzing large spatial datasets. Artificial data analyses suggest that the model’s parameters are identifiable and that the model can accurately detect outlying observations if they exist, underscoring its reliability and robustness. The model quantifies uncertainty and captures local structures more effectively than the multivariate Gaussian model when applied to maximum concentrations of NO₂ and PM₁₀ on a day at 382 sites across Italy. Further, the described approximation methods prove effective in analyzing large spatial datasets.
Title: A heavy-tailed model for multivariate spatial processes. Journal: Spatial Statistics, volume 71, Article 100948.
Pub Date: 2026-03-01; Epub Date: 2025-11-20; DOI: 10.1016/j.spasta.2025.100944
Garazi Retegui, Jaione Etxeberria, María Dolores Ugarte
Cancer data, particularly cancer incidence and mortality, are fundamental for understanding the cancer burden, setting targets for cancer control, and evaluating how the implementation of a cancer control policy evolves. However, the complexity of data collection, classification, validation and processing results in cancer incidence figures often lagging two to three years behind the calendar year. In response, national or regional population-based cancer registries (PBCRs) are increasingly interested in methods for forecasting cancer incidence. In many countries, projecting cancer incidence is further complicated because regional registries are usually not established in the same year, so cancer incidence data series from different regions of a country are not harmonized over time. This study addresses the challenge of forecasting cancer incidence with incomplete data at both regional and national levels. To achieve this, we propose the use of multivariate spatio-temporal shared component models that jointly model mortality data and available cancer incidence data. We evaluate the performance of these multivariate models using lung cancer incidence data and the corresponding number of deaths reported in England for the period 2001–2019. Model performance was assessed using different predictive measures to select the best model.
Title: Multivariate spatio-temporal modelling for completing cancer registries and forecasting incidence. Journal: Spatial Statistics, volume 71, Article 100944.
Pub Date: 2026-03-01; Epub Date: 2025-12-16; DOI: 10.1016/j.spasta.2025.100950
Bruno Bracalente, Antonio Forcina, Nicola Falocci
Empirical analyses of the factors driving vote switching are rare, usually conducted at the national level, and often unreliable due to the inaccuracy of recall survey data. To overcome the lack of adequate individual survey data, and to incorporate the increasingly relevant role of local factors, we propose an ecological inference methodology to estimate the counts of vote transitions within small homogeneous areas and to assess their relationships with local characteristics through multinomial logistic models. This approach allows for a disaggregated analysis of contextual factors behind vote switching across both origins and destinations. We apply this methodology to the Italian region of Umbria, divided into 19 small areas. To explain the number of transitions toward the right-wing nationalist party that won the 2022 general elections and toward increasing abstentionism, we focused on measures of geographical, economic, and cultural disadvantages of local communities. Among the main findings, economic disadvantages mainly pushed previous abstainers and far-right Lega voters to change their choices in favor of the rising right-wing party, while transitions from the opposite political camp were mostly influenced by cultural factors such as a lack of social capital, a negative attitude towards the EU, and political tradition.
Title: Determinants of vote transitions by ecological inference within small areas. Journal: Spatial Statistics, volume 71, Article 100950.
Pub Date: 2026-03-01; Epub Date: 2025-11-15; DOI: 10.1016/j.spasta.2025.100940
Alba Bernabeu, Claudio Fronterrè, Jorge Mateu
Spatio-temporal point pattern data often exhibit clustering, which may arise either from self-exciting mechanisms or from latent environmental heterogeneity. Hawkes processes and log-Gaussian Cox processes (LGCPs) represent two widely used but fundamentally different modelling approaches for capturing such patterns. While Hawkes processes assume event-driven triggering, LGCPs model clustering through a latent Gaussian random field acting on the intensity function. In practice, model selection between these alternatives is rarely conducted rigorously, and second-order characteristics are often insufficient to discriminate between them. We present a simulation-based comparative study that systematically evaluates estimation, second-order structure, and predictive performance under model misspecification. In particular, we assess the ability of each model to reproduce the dynamics of data generated under the competing framework. We also analyse real crime data to illustrate how inference and prediction are affected by model choice. Our results underscore the interpretive consequences of model misspecification and highlight key diagnostic limitations when disentangling clustering mechanisms in spatio-temporal processes.
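To make the "event-driven triggering" assumption of Hawkes processes concrete, the following is a minimal sketch of a conditional intensity with an exponential kernel. It is purely temporal (the spatial component is omitted for brevity) and is not the specification used in the paper; the function name and the parameters mu, alpha and beta are hypothetical choices for illustration.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity of a temporal Hawkes process at time t.

    Assumed (illustrative) form: lambda(t) = mu + sum over past events s < t
    of alpha * beta * exp(-beta * (t - s)), where mu is the background rate,
    alpha the expected number of offspring per event, and beta the decay rate.
    """
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - s))
        for s in event_times
        if s < t  # only events strictly before t can trigger new ones
    )
    return mu + excitation


# Example usage: intensity before any event and shortly after a burst of events.
events = [1.0, 1.2, 4.0]
baseline = hawkes_intensity(0.5, events, mu=0.2, alpha=0.5, beta=2.0)
after_burst = hawkes_intensity(1.3, events, mu=0.2, alpha=0.5, beta=2.0)
```

In this sketch clustering arises only because past events raise the intensity. In an LGCP, by contrast, the intensity would be the exponential of a latent Gaussian field and would not depend on the event history, which is the distinction the abstract describes.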
Title: Misspecification issues between competitive spatio-temporal cluster point processes. Journal: Spatial Statistics, volume 71, Article 100940.
Pub Date: 2026-03-01; Epub Date: 2025-11-29; DOI: 10.1016/j.spasta.2025.100947
S. Emili, F. Galli
In this paper, we develop a system of simultaneous stochastic frontier models with inefficiency determinants, spatio-temporal effects and correlated inefficiency as well as correlated random errors among frontiers. The dependence among the errors of the different equations can stem from either shocks external to the system, interrelated inefficiency mechanisms, or a combination of both. Estimation is performed using a copula-based quasi-maximum likelihood approach. Simulation results confirm the good finite sample properties of the proposed estimator. To demonstrate the effectiveness of the proposed model and estimation technique in empirical settings, we analyse the key role of some sustainability-related factors in determining the efficiency level of Italian cultural and creative sectors.
Title: A simultaneous system of dynamic spatial stochastic frontier models with dependent error components and inefficiency determinants. Journal: Spatial Statistics, volume 71, Article 100947.
Pub Date: 2026-03-01; Epub Date: 2025-11-12; DOI: 10.1016/j.spasta.2025.100942
Yining Han, Xiaobin Chen, Xingxing Huang, Zhongyin Liu
The integration of multi-source data often results in the occurrence of local over-densities at point locations. These over-densities can bias the estimation of spatial statistical parameters, such as mean and variance, compromise the quality of variogram fitting, and degrade the accuracy of interpolation results. However, a standardized approach for identifying local over-densities in spatial datasets is currently lacking. In this paper, we propose three parameters—self-sparsity, mutual sparsity, and small-distance variability—and construct a three-parameter comprehensive cross-plot to facilitate the visual identification of local over-densities within spatial data points, thereby enabling further processing. Using both synthetic datasets generated via stochastic process simulations and real-world datasets, we demonstrate that the proposed three-parameter comprehensive cross-plot, based on self-sparsity, mutual sparsity, and small-distance variability, effectively identifies local over-densities in spatial datasets. Furthermore, by appropriately processing these over-densities, the accuracy of spatial statistical parameter estimation can be enhanced, a more reliable theoretical variogram model can be established, and both spatial statistical analysis and interpolation results can ultimately be improved.
Title: A visual method for identifying local over-densities in spatial data. Journal: Spatial Statistics, volume 71, Article 100942.
Pub Date: 2026-03-01; Epub Date: 2025-11-21; DOI: 10.1016/j.spasta.2025.100946
Karina Lilleborge, Sara Martino, Geir-Arne Fuglstad, Finn Lindgren, Rikke Ingebrigtsen
Metric graphs are useful tools for describing spatial domains like road and river networks, where spatial dependence acts along the network. We take advantage of recent developments for Gaussian Random Fields (GRFs) on such graphs, and consider joint spatial modeling of observations with different spatial supports. Motivated by an application to traffic state modeling in Trondheim, Norway, we consider line-referenced data, which can be described by an integral of the GRF along a line segment on the metric graph, and point-referenced data. Through a simulation study inspired by the application, we investigate the number of replicates that are needed to estimate parameters and to predict at unobserved locations. The former is assessed using bias and variability, and the latter is assessed through root mean square error (RMSE), continuous ranked probability scores (CRPSs), and coverage. Joint modeling is contrasted with a simplified approach that treats line-referenced observations as point-referenced observations. The results suggest joint modeling leads to strong improvements. The application to Trondheim, Norway, combines point-referenced induction loop data and line-referenced public transportation data. To ensure positive speeds, we use a non-linear link function, which requires integrals of non-linear combinations of the linear predictor. This is made computationally feasible by a combination of the R packages inlabru and MetricGraph, and new code for processing geographical line data to work with existing graph representations and fmesher methods for dealing with line support in inlabru on objects from MetricGraph. We fit the model to two datasets where we expect different spatial dependency and compare the results.
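The two predictive scores named in the abstract can be sketched as follows. This is a minimal illustration that assumes each prediction is summarized by a Gaussian predictive distribution with mean mu and standard deviation sigma, for which the CRPS has a well-known closed form; the function names are hypothetical, and the paper itself may compute these scores differently (for example from posterior samples).

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed values and point predictions."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def crps_gaussian(y, mu, sigma):
    """Closed-form CRPS of a N(mu, sigma^2) predictive distribution at outcome y.

    Lower is better; the score is in the same units as y.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


# Example usage: the same point prediction scored with a sharp and a diffuse predictive.
sharp = crps_gaussian(y=10.0, mu=10.5, sigma=0.5)
diffuse = crps_gaussian(y=10.0, mu=10.5, sigma=5.0)
```

Unlike RMSE, the CRPS also rewards well-calibrated uncertainty, which is why the two are typically reported together with interval coverage.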
Title: Joint modeling of line and point data on metric graphs. Journal: Spatial Statistics, volume 71, Article 100946.
Pub Date: 2026-03-01; Epub Date: 2025-11-21; DOI: 10.1016/j.spasta.2025.100945
Qing Zhao, Yunyi Shen
Density dependence occurs at the individual level and thus is greatly influenced by local spatial heterogeneity in habitat conditions. However, density dependence is often evaluated at the population level, leading to difficulties or even controversies in detecting such a process. Bayesian individual-based models such as spatial capture-recapture (SCR) models provide opportunities to study density dependence at the individual level, but such an approach remains to be developed and evaluated. In this study, we developed an SCR model that links habitat use to apparent survival and recruitment through density-dependent processes at the individual level. Using simulations, we found that the model can properly inform habitat use, but tends to underestimate the effect of density dependence on apparent survival and recruitment. Such underestimation is likely due to the difficulty the current model has in identifying the locations of unobserved individuals without using environmental covariates to inform these locations. How to accurately estimate the locations of unobserved individuals, and thus density dependence, remains a challenging topic in spatial statistics and statistical ecology.
Title: Explicit modeling of density dependence in spatial capture-recapture models. Journal: Spatial Statistics, volume 71, Article 100945.
Pub Date : 2026-03-01 Epub Date: 2025-12-16 DOI: 10.1016/j.spasta.2025.100949
Lucas da Cunha Godoy, Marcos Oliveira Prates, Jun Yan
In Brazil, socioeconomic data are available at census tracts (polygons), while election data are available at voting locations (point-referenced). This misalignment makes it challenging to study the association between election outcomes and socioeconomic variables. Since voters vote at the nearest voting station, we use a Voronoi tessellation to associate each voting station with a Voronoi polygon. Socioeconomic variables for each polygon are then constructed from the census tract data, assuming that both sets of areal data were constructed from the same underlying Gaussian field (GF). Predictions for the Voronoi cells are derived from the underlying GF with estimated parameters. Since the socioeconomic variables are not normally distributed, we also consider a nonparametric approach that uses spatial areal interpolation to construct data for the Voronoi cells from the census tract data. The interpolated outputs are used as a baseline. Our simulation study shows that the method based on an underlying GF is robust in prediction under model misspecification. In application to the 2018 Brazilian presidential election in Belo Horizonte, more socioeconomically deprived regions were found to have a higher percentage of null votes. The proposed methods are implemented in the R package smile, facilitating reproducible spatial analyses under misalignment.
{"title":"Voronoi linkage between mismatching voting stations and census tracts in analyzing the 2018 Brazilian presidential election data","authors":"Lucas da Cunha Godoy, Marcos Oliveira Prates, Jun Yan","doi":"10.1016/j.spasta.2025.100949","DOIUrl":"10.1016/j.spasta.2025.100949","url":null,"abstract":"In Brazil, socioeconomic data are available at census tracts (polygons), while election data are available at voting locations (point-referenced). This misalignment makes it challenging to study the association between election outcomes and socioeconomic variables. Since voters vote at the nearest voting station, we use a Voronoi tessellation to associate each voting station with a Voronoi polygon. Socioeconomic variables for each polygon are then constructed from the census tract data, assuming that both sets of areal data were constructed from the same underlying Gaussian field (GF). Predictions for the Voronoi cells are derived from the underlying GF with estimated parameters. Since the socioeconomic variables are not normally distributed, we also consider a nonparametric approach that uses spatial areal interpolation to construct data for the Voronoi cells from the census tract data. The interpolated outputs are used as a baseline. Our simulation study shows that the method based on an underlying GF is robust in prediction under model misspecification. In application to the 2018 Brazilian presidential election in Belo Horizonte, more socioeconomically deprived regions were found to have a higher percentage of null votes. The proposed methods are implemented in the R package smile, facilitating reproducible spatial analyses under misalignment.","PeriodicalId":48771,"journal":{"name":"Spatial Statistics","volume":"71 ","pages":"Article 100949"},"PeriodicalIF":2.5,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145840252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
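The Voronoi linkage and the nonparametric areal-interpolation baseline described in the abstract above can be illustrated with a minimal sketch. It rests on the fact that a location lies in a station's Voronoi cell exactly when that station is its nearest one. Everything here is an assumption made for illustration: the function names, the toy coordinates, and the grid approximation of area-weighted averaging are hypothetical. It is not the authors' Gaussian-field method and not the API of the R package smile.

```python
from collections import defaultdict
from math import hypot


def nearest_station(point, stations):
    """Return the index of the station closest to `point`.

    That index also identifies the Voronoi cell containing `point`.
    """
    return min(
        range(len(stations)),
        key=lambda i: hypot(point[0] - stations[i][0], point[1] - stations[i][1]),
    )


def interpolate_to_cells(stations, tract_value_at, grid):
    """Approximate the area-weighted mean of a tract-level variable per cell.

    Each grid point stands for an equal share of area, so averaging the
    tract values over the grid points in a cell approximates areal
    interpolation from tracts to Voronoi cells.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for p in grid:
        cell = nearest_station(p, stations)
        totals[cell] += tract_value_at(p)
        counts[cell] += 1
    return {cell: totals[cell] / counts[cell] for cell in counts}


# Hypothetical toy data on the unit square: two voting stations, so the
# Voronoi boundary is the vertical line x = 0.5.
stations = [(0.25, 0.5), (0.75, 0.5)]


def tract_value_at(p):
    """Hypothetical census variable: tract A (x < 0.25) = 10, tract B = 20."""
    return 10.0 if p[0] < 0.25 else 20.0


n = 20  # grid resolution; cell-centred points avoid the cell boundary
grid = [((i + 0.5) / n, (j + 0.5) / n) for i in range(n) for j in range(n)]
cell_values = interpolate_to_cells(stations, tract_value_at, grid)
```

In this toy setup the first station's cell is split evenly between the two tracts, so its interpolated value is the midpoint 15.0, while the second cell lies entirely in tract B and keeps the value 20.0. A real analysis would intersect the actual polygons rather than use a grid.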