The growing accessibility of Light Detection and Ranging (LiDAR) data brings out novel perspectives that are crucial for tracking forest growth and enhancing resource management amid climate change. Utilizing these data to propose decision-support tools involves a vital step of segmenting individual trees. A widely adopted class of methods for this step is known as Local Maxima algorithms, which, although unsupervised, rely on per-site and/or per-species hyperparameter tuning for optimal performance. In this work, we introduce a novel methodological framework grounded in point process theory to jointly model the data generation process and provide formal implementation guidelines for refining window size selection within the class of Local Maxima algorithms. This methodology can also be applied to incomplete plot measurements, alleviating a constraint noted in most data acquisition procedures. To ensure the reproducibility of the results and validate the practical application, we apply the proposed methodology in two cases: (i) a simulated dataset (made publicly available) and (ii) an open real dataset. The simulation study evaluates performance under spatial configurations that do not necessarily follow the assumed point process model used for window calibration, thereby assessing robustness to model misspecification. The method outperforms the baseline approaches in the simulation study for the detection task, and achieves F1-scores between 55% and 90% on real data. On average, it improves upon the second-best method by about 4%, with performance depending on a tree’s position within the canopy relative to its neighbors.
Title: Adaptive local maxima windows for tree detection: A point process perspective. Authors: Konstantinos Florakis, Véronique Letort, Raphaël Canals, Gilles Faÿ, Samis Trevezas. Journal: Spatial Statistics, Volume 72, Article 100954. Publication date: 2026-01-10. DOI: 10.1016/j.spasta.2026.100954
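To make the role of the window size concrete, here is a minimal Python sketch of a generic Local Maxima detector on a canopy height model (CHM). It is an illustration only, not the point-process calibration proposed in the paper: the function name `local_maxima`, the `min_height` threshold, and the synthetic two-crown raster are all assumptions introduced for this example.

```python
import numpy as np
from scipy.ndimage import maximum_filter


def local_maxima(chm, window, min_height=2.0):
    """Return (row, col) indices of cells that are the maximum of their
    window x window neighborhood and at least min_height tall.

    Hypothetical helper: a generic Local Maxima detector, not the
    calibrated procedure of the paper.
    """
    neighborhood_max = maximum_filter(chm, size=window, mode="nearest")
    is_peak = (chm == neighborhood_max) & (chm >= min_height)
    return np.argwhere(is_peak)


# Synthetic CHM (assumption): two Gaussian crowns, 10 m and 8 m tall.
yy, xx = np.mgrid[0:40, 0:40]
chm = (10.0 * np.exp(-((yy - 10) ** 2 + (xx - 10) ** 2) / 18.0)
       + 8.0 * np.exp(-((yy - 28) ** 2 + (xx - 30) ** 2) / 18.0))

peaks_small = local_maxima(chm, window=5)    # window smaller than crown spacing
peaks_large = local_maxima(chm, window=41)   # window spanning both crowns
```

On this toy raster the small window recovers both treetops, while the large window keeps only the taller one, which is the kind of sensitivity that motivates principled window selection.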
Pub Date: 2026-01-06, DOI: 10.1016/j.spasta.2026.100953
Youhua Chen , Tsung-Jen Shen
A common practice in ecological and biodiversity research for estimating local species diversity levels is to integrate both a regional species abundance distribution model and a spatial distributional aggregation model of species. In this study, we argue that the inclusion of a species-specific spatial aggregation model is unnecessary in many cases because the regional species abundance distribution model can be directly transformed into a local species abundance distribution model to estimate local species richness and diversity levels. We support this claim by extensively investigating varying-scale species-area relation (SAR) patterns through a spatially explicit semi-empirical test on a fully censused forest plot, considering various spatial sampling scenarios. When local spatial sampling is randomly conducted with small or moderate operative sampling units (i.e., quadrats), estimated species richness closely matches theoretical expectations for the SAR curve (i.e., SAR rarefaction curve including both interpolation and extrapolation), as the corresponding confidence intervals consistently cover the true values. However, during the extrapolation process (i.e., spatially sampling a local proportion of the forest plot and estimating species richness at a larger proportion of the plot), species richness is sometimes underestimated when local spatial sampling is conducted using large quadrats or a single contiguous region, likely due to the effect of spatial autocorrelation. Moreover, contiguous area sampling becomes challenging when the single area covers natural barriers such as rivers or steep terrain in macro-ecological and spatial ecology research. By contrast, ecologists typically rely on information collected from many small-sized sampling plots for conducting biodiversity inference.
Consequently, in field practice, accounting for local spatial sampling, or more specifically, integrating a spatial distributional aggregation model of species for biodiversity level estimation, is unnecessary in most cases. In conclusion, as long as ecologists can implement spatially random and unconstrained sampling, the two-step modeling approach is falsified and tends to produce potentially misleading conclusions on diversity estimation and extinction risk assessments. Nonetheless, the local spatial aggregation model can still be helpful when large portions of the study region are inaccessible or when the local sampling cannot be conducted freely and randomly in space. A computational R package for estimating and plotting SAR with unconditional variance calculation is available at the following URL: https://zenodo.org/records/14821773.
Title: Estimating and plotting species-area relationship: Does aggregate distribution of species really matter? Journal: Spatial Statistics, Volume 72, Article 100953.
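As a rough illustration of how regional abundances alone can yield a species-area curve, the Python sketch below uses the classical random placement model, in which a species with n individuals is present in a fraction a of the plot with probability 1 - (1 - a)^n. This is a stand-in chosen for the example and is not claimed to be the estimator implemented in the authors' R package; the function name `expected_local_richness` and the abundance vector are hypothetical.

```python
import numpy as np


def expected_local_richness(abundances, area_fraction):
    """Expected species richness in a fraction of the plot under random
    placement of individuals (no spatial aggregation model).

    Hypothetical helper for illustration only.
    """
    n = np.asarray(abundances, dtype=float)
    a = float(area_fraction)
    if not 0.0 <= a <= 1.0:
        raise ValueError("area_fraction must lie in [0, 1]")
    # Probability that at least one of the n individuals falls in the area.
    p_present = 1.0 - (1.0 - a) ** n
    return float(p_present.sum())


# Regional abundances of three species (made-up numbers).
regional_abundances = [1, 10, 100]
sar_curve = [(a, expected_local_richness(regional_abundances, a))
             for a in (0.1, 0.25, 0.5, 1.0)]
```

The curve rises monotonically with the sampled fraction and reaches the regional richness when the whole plot is covered.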
Pub Date: 2026-01-02, DOI: 10.1016/j.spasta.2025.100952
Jong Hyeon Lee , Jongmin Kim , Heesang Lee , Jaewoo Park
A large class of spatial models contains intractable normalizing functions, such as spatial lattice models, interaction spatial point processes, and social network models. Bayesian inference for such models is challenging since the resulting posterior distribution is doubly intractable. Although auxiliary variable MCMC (AVM) algorithms are known to be the most practical, they are computationally expensive due to the repeated auxiliary variable simulations. To address this, we propose delayed-acceptance AVM (DA-AVM) methods, which can reduce the number of auxiliary variable simulations. The first stage of the kernel uses a cheap surrogate to decide whether to accept or reject the proposed parameter value. The second stage guarantees detailed balance with respect to the posterior. The auxiliary variable simulation is performed only on the parameters accepted in the first stage. We construct various surrogates specifically tailored for doubly intractable problems, including subsampling strategy, Gaussian process emulation, and frequentist estimator-based approximation. We validate our method through simulated and real data applications, demonstrating its practicality for complex spatial models.
Title: A delayed acceptance auxiliary variable MCMC for spatial models with intractable likelihood function. Journal: Spatial Statistics, Volume 72, Article 100952.
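The two-stage kernel described above can be sketched on a toy problem. The Python example below implements generic delayed-acceptance Metropolis-Hastings with a symmetric random-walk proposal and a tractable target, so the second stage evaluates the target directly instead of simulating an auxiliary variable as DA-AVM does. The target, the surrogate, and all names (`delayed_acceptance_mh`, `log_target`, `log_surrogate`) are assumptions made for illustration.

```python
import numpy as np


def delayed_acceptance_mh(log_target, log_surrogate, theta0, n_iter, step, rng):
    """Generic delayed-acceptance random-walk Metropolis-Hastings.

    Stage 1 screens proposals with the cheap surrogate; stage 2 corrects
    with the expensive target so that detailed balance holds with respect
    to the target. Returns the chain and the number of stage-2 evaluations.
    """
    theta = theta0
    samples = np.empty(n_iter)
    n_expensive = 0
    for i in range(n_iter):
        proposal = theta + step * rng.normal()
        # Stage 1: surrogate-only acceptance ratio (symmetric proposal).
        log_a1 = log_surrogate(proposal) - log_surrogate(theta)
        if np.log(rng.uniform()) < min(0.0, log_a1):
            # Stage 2: the expensive ratio, divided by the surrogate ratio.
            # (A real implementation would cache log_target(theta).)
            n_expensive += 1
            log_a2 = (log_target(proposal) - log_target(theta)) - log_a1
            if np.log(rng.uniform()) < min(0.0, log_a2):
                theta = proposal
        samples[i] = theta
    return samples, n_expensive


# Toy setup (assumption): target N(1, 1), deliberately imperfect surrogate N(0.8, 1.2^2).
def log_target(t):
    return -0.5 * (t - 1.0) ** 2


def log_surrogate(t):
    return -0.5 * ((t - 0.8) / 1.2) ** 2


rng = np.random.default_rng(0)
samples, n_expensive = delayed_acceptance_mh(
    log_target, log_surrogate, theta0=0.0, n_iter=20000, step=1.5, rng=rng
)
```

Proposals rejected in stage 1 never trigger the expensive step, which is where the savings come from; the stage-2 correction keeps the chain targeting the exact distribution even though the surrogate is biased.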
Pub Date: 2025-12-30, DOI: 10.1016/j.spasta.2025.100951
Soyun Jeon , Jungsoon Choi
Elevated levels of PM10 are known to cause severe respiratory and cardiovascular diseases, and, in extreme cases, cancer and mortality. Despite various reduction policies implemented across different sectors, PM10 concentrations in South Korea continue to exceed the annual recommended limit set by the World Health Organization. Spatio-temporal PM10 concentrations may exhibit both spatial and temporal dependence. Additionally, interactions between PM10 and environmental factors can further influence the variability in PM10. Therefore, this study proposes a method that incorporates the spatio-temporal neighbors of covariates alongside those of PM10 by adopting an approach that captures spatio-temporal interactions through spatio-temporal neighbors. Vine copula was used to integrate pairwise dependence structures between a given location and its surrounding spatio-temporal neighbors. We applied the model to weekly average PM10 data for South Korea in 2019, using PM2.5, CO, population density, nighttime light intensity, land-use mix and air temperature as covariates. As PM10 exhibited skewness, its marginal distribution was modeled using the Gumbel and Generalized Extreme Value distributions. The proposed model outperformed a spatio-temporal mixed effects model, a kriging method, and alternative copula-based approaches, particularly in predicting the top 5% of extreme values, by effectively capturing tail dependence crucial for extreme value analysis. This study highlights the importance of utilizing vine copula to effectively model diverse dependence structures in spatio-temporal data while simultaneously accommodating spatial and temporal dimensions, including spatio-temporal dependence among covariates. The results underscore the broader applicability of the proposed approach to other fields where complex dependence structures are present.
Title: Copula-based spatio-temporal modeling of air pollutant data incorporating covariate dependence. Journal: Spatial Statistics, Volume 72, Article 100951.
Pub Date: 2025-12-16, DOI: 10.1016/j.spasta.2025.100950
Bruno Bracalente , Antonio Forcina , Nicola Falocci
Empirical analyses on the factors driving vote switching are rare, usually conducted at the national level and often unreliable due to the inaccuracy of recall survey data. To overcome the problem of lack of adequate individual survey data, and to incorporate the increasingly relevant role of local factors, we propose an ecological inference methodology to estimate the counts of vote transitions within small homogeneous areas and to assess their relationships with local characteristics through multinomial logistic models. This approach allows for a disaggregate analysis of contextual factors behind vote switching both across origins and destinations. We apply this methodology to the Italian region of Umbria, divided into 19 small areas. To explain the number of transitions toward the right-wing nationalist party that won the 2022 general elections and towards increasing abstentionism, we focused on measures of geographical, economic, and cultural disadvantages of local communities. Among the main findings, the economic disadvantages mainly pushed previous abstainers and far-right Lega voters to change their choices in favor of the rising right-wing party, while transitions from the opposite political camp were mostly influenced by cultural factors such as a lack of social capital, negative attitude towards the EU, and political tradition.
Title: Determinants of vote transitions by ecological inference within small areas. Journal: Spatial Statistics, Volume 71, Article 100950.
Pub Date: 2025-12-16, DOI: 10.1016/j.spasta.2025.100949
Lucas da Cunha Godoy , Marcos Oliveira Prates , Jun Yan
In Brazil, socioeconomic data are available at census tracts (polygons), while election data are available at voting locations (point-referenced). The misaligned data makes studying the association between election outcomes and socioeconomic variables challenging. Since voters vote at the nearest voting stations, we use a Voronoi tessellation to associate each voting station with a Voronoi polygon. Socioeconomic variables for each polygon are then constructed from such data at the census tract level, assuming that both sets of areal data were constructed from the same underlying Gaussian field (GF). Predictions for the Voronoi cells are derived from the underlying GF with estimated parameters. Since the socioeconomic variables are not normally distributed, we also consider a nonparametric approach that uses spatial areal interpolation to construct data for the Voronoi cells from the census tract data. The interpolated outputs are used as a baseline. Our simulation study shows that the method based on an underlying GF is robust in prediction under model misspecification. In application to the 2018 Brazilian presidential election in Belo Horizonte, more socioeconomically deprived regions were found to have a higher percentage of null votes. The proposed methods are implemented in the R package smile, facilitating reproducible spatial analyses under misalignment.
Title: Voronoi linkage between mismatching voting stations and census tracts in analyzing the 2018 Brazilian presidential election data. Journal: Spatial Statistics, Volume 71, Article 100949.
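The link between "voting at the nearest station" and a Voronoi tessellation can be shown in a few lines: a location lies in a station's Voronoi cell exactly when that station is its nearest neighbor. The Python sketch below demonstrates only this equivalence with a k-d tree; it does not reproduce the Gaussian-field or areal-interpolation steps of the paper or the API of the `smile` package, and the coordinates are made up.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical coordinates: three voting stations and four residences.
stations = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
residences = np.array([[0.5, 0.2], [3.0, 1.0], [1.0, 3.5], [2.1, 0.0]])

# The index of the nearest station is the index of the Voronoi cell
# that contains each residence.
tree = cKDTree(stations)
distance, cell = tree.query(residences)
```

Here `cell` holds, for each residence, the index of the station whose Voronoi polygon contains it.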
Pub Date: 2025-12-10, DOI: 10.1016/j.spasta.2025.100948
Paritosh Kumar Roy , Alexandra M. Schmidt
Environmental data commonly involves measuring multiple pollutants, such as NO2 and PM10 levels, at some fixed sites across a region. Data analysts aim to describe the processes accounting for covariance across space and among pollutants, usually assuming a multivariate spatial Gaussian model with a stationary covariance function. However, the observed data distribution often exhibits heterogeneous variability, resulting in heavier tails than the Gaussian distribution. To address these challenges by avoiding data transformation, we propose a flexible multivariate spatial model with spatially varying covariate-dependent variance that naturally accommodates heavy-tailed distributions. Specifically, we extend the linear model of coregionalization by modeling the variances of the processes, allowing them to vary across space and depending on covariates. We discuss the properties of the proposed model and outline a Bayesian inference procedure implemented using the software Stan. As the model involves several Gaussian process components, we further discuss Vecchia-based approximation methods for analyzing large spatial datasets. Artificial data analyses suggest that the model’s parameters are identifiable and can accurately detect outlying observations if they exist, underscoring the model’s reliability and robustness. The model quantifies uncertainty and captures local structures more effectively than the multivariate Gaussian model when applied to maximum concentrations of NO2 and PM10 on a day at 382 sites across Italy. Further, the described approximation methods show effectiveness in analyzing large spatial datasets.
Title: A heavy-tailed model for multivariate spatial processes. Journal: Spatial Statistics, Volume 71, Article 100948.
Pub Date: 2025-11-29, DOI: 10.1016/j.spasta.2025.100947
S. Emili, F. Galli
In this paper, we develop a system of simultaneous stochastic frontier models with inefficiency determinants, spatio-temporal effects and correlated inefficiency as well as correlated random errors among frontiers. The dependence among the errors of the different equations can stem from either shocks external to the system, interrelated inefficiency mechanisms, or a combination of both. Estimation is performed using a copula-based quasi-maximum likelihood approach. Simulation results confirm the good finite sample properties of the proposed estimator. To demonstrate the effectiveness of the proposed model and estimation technique in empirical settings, we analyse the key role of some sustainability-related factors in determining the efficiency level of Italian cultural and creative sectors.
Title: A simultaneous system of dynamic spatial stochastic frontier models with dependent error components and inefficiency determinants. Journal: Spatial Statistics, Volume 71, Article 100947.
Pub Date: 2025-11-21, DOI: 10.1016/j.spasta.2025.100946
Karina Lilleborge, Sara Martino, Geir-Arne Fuglstad, Finn Lindgren, Rikke Ingebrigtsen
Metric graphs are useful tools for describing spatial domains like road and river networks, where spatial dependence acts along the network. We take advantage of recent developments for Gaussian Random Fields (GRFs) on metric graphs, and consider joint spatial modeling of observations with different spatial supports. Motivated by an application to traffic state modeling in Trondheim, Norway, we consider point-referenced data together with line-referenced data, which can be described by an integral of the GRF along a line segment on the metric graph. Through a simulation study inspired by the application, we investigate the number of replicates needed to estimate parameters and to predict at unobserved locations. The former is assessed using bias and variability, and the latter through root mean square error (RMSE), continuous ranked probability scores (CRPSs), and coverage. Joint modeling is contrasted with a simplified approach that treats line-referenced observations as point-referenced observations. The results suggest that joint modeling leads to strong improvements. The application to Trondheim, Norway, combines point-referenced induction loop data and line-referenced public transportation data. To ensure positive speeds, we use a non-linear link function, which requires integrals of non-linear combinations of the linear predictor. This is made computationally feasible by a combination of the R packages inlabru and MetricGraph, together with new code for processing geographical line data to work with existing graph representations and fmesher methods for handling line support in inlabru on objects from MetricGraph. We fit the model to two datasets for which we expect different spatial dependence, and compare the results.
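As a rough illustration of what a line-referenced observation involves, the sketch below approximates the integral of a non-linear transform of a linear predictor along a straight segment with the midpoint rule. It is not the authors' implementation and does not use inlabru, MetricGraph, or fmesher. The function names `line_integral` and `eta`, the choice of `exp` as the positivity-preserving link, and the deterministic surface standing in for a GRF realization are all assumptions made for illustration.

```python
import math

def line_integral(eta, start, end, link=math.exp, n=200):
    """Midpoint-rule approximation of the integral of link(eta(s)) along a straight segment."""
    (x0, y0), (x1, y1) = start, end
    length = math.hypot(x1 - x0, y1 - y0)
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n  # midpoint of the k-th sub-interval, in [0, 1]
        total += link(eta(x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return length * total / n

# Hypothetical smooth surface standing in for one realization of the linear predictor.
def eta(x, y):
    return 0.5 * math.sin(x) + 0.1 * y

# Integral of exp(eta) along a segment of length 5, and the implied average along it.
segment_value = line_integral(eta, (0.0, 0.0), (3.0, 4.0))
mean_along_segment = segment_value / 5.0
```

Because the link is applied before integrating, the result generally differs from applying the link to the integrated predictor, which is why the non-linear case needs numerical integration of this kind rather than a single linear projection.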
{"title":"Joint modeling of line and point data on metric graphs","authors":"Karina Lilleborge, Sara Martino, Geir-Arne Fuglstad, Finn Lindgren, Rikke Ingebrigtsen","doi":"10.1016/j.spasta.2025.100946","journal":{"name":"Spatial Statistics","volume":"71","pages":"Article 100946"},"publicationDate":"2025-11-21"}
Pub Date: 2025-11-21, DOI: 10.1016/j.spasta.2025.100945
Qing Zhao, Yunyi Shen
Density dependence occurs at the individual level and thus is greatly influenced by local spatial heterogeneity in habitat conditions. However, density dependence is often evaluated at the population level, leading to difficulties or even controversies in detecting such a process. Bayesian individual-based models such as spatial capture-recapture (SCR) models provide opportunities to study density dependence at the individual level, but such an approach remains to be developed and evaluated. In this study, we developed an SCR model that links habitat use to apparent survival and recruitment through density-dependent processes at the individual level. Using simulations, we found that the model can properly inform habitat use, but tends to underestimate the effect of density dependence on apparent survival and recruitment. Such underestimation is likely due to the difficulty the current model has in identifying the locations of unobserved individuals without environmental covariates to inform these locations. How to accurately estimate the locations of unobserved individuals, and thus density dependence, remains a challenging topic in spatial statistics and statistical ecology.
{"title":"Explicit modeling of density dependence in spatial capture-recapture models","authors":"Qing Zhao, Yunyi Shen","doi":"10.1016/j.spasta.2025.100945","journal":{"name":"Spatial Statistics","volume":"71","pages":"Article 100945"},"publicationDate":"2025-11-21"}