Pub Date: 2026-02-02, DOI: 10.1016/j.spasta.2026.100962
Chaoyi Lu, Nial Friel
There is increasing interest in developing Bayesian inferential algorithms for point process models with intractable likelihoods. One purpose of this paper is to illustrate the utility of simulation-based strategies for this task, including Approximate Bayesian Computation (ABC) and Markov Chain Monte Carlo (MCMC) methods. Shirota and Gelfand (2017) proposed an extended version of an ABC approach for Repulsive Spatial Point Processes (RSPP), but their algorithm was not correctly detailed. In this paper, we correct their method and, building on the corrected version, propose a new ABC-MCMC algorithm which, unlike a typical ABC method, introduces the Markov property. Though the algorithm is generally impractical to use in its exact form, Monte Carlo approximations can be leveraged for the intractable terms. Another aspect of this paper is to explore the use of the exchange algorithm and the noisy Metropolis–Hastings algorithm (Alquier et al., 2016) for RSPP. Comparisons with the ABC-MCMC methods are also provided. We find that the inferential approaches outlined above perform well for RSPP in both simulated and real data applications and should be considered viable approaches for the analysis of these models.
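To make the ABC-MCMC idea concrete, the following is a minimal sketch of a textbook ABC-MCMC sampler on a toy problem. It is not the corrected algorithm of the paper, and the model is a homogeneous Poisson process on the unit square rather than a repulsive process. The function names, the uniform prior on (0, `lam_max`), the point count as summary statistic, and the tolerance `eps` are all assumptions chosen for illustration.

```python
import random

def simulate_count(lam, rng):
    """Number of points of a homogeneous Poisson process with intensity lam
    on the unit square, simulated via unit-rate exponential inter-arrivals."""
    n, t = 0, rng.expovariate(1.0)
    while t < lam:
        n += 1
        t += rng.expovariate(1.0)
    return n

def abc_mcmc(n_obs, n_iter=20000, eps=2, step=10.0, lam_max=200.0, seed=0):
    """Toy ABC-MCMC for the intensity lam with a Uniform(0, lam_max) prior.

    With a symmetric random-walk proposal and a flat prior, a proposal is
    accepted exactly when it lies in the prior support and the simulated
    summary falls within eps of the observed one.
    """
    rng = random.Random(seed)
    lam = float(n_obs)  # start where simulated data can match the observation
    chain = []
    for _ in range(n_iter):
        prop = lam + rng.gauss(0.0, step)
        if 0.0 < prop < lam_max:
            # The likelihood is never evaluated: only a forward simulation is needed.
            if abs(simulate_count(prop, rng) - n_obs) <= eps:
                lam = prop
        chain.append(lam)
    return chain

if __name__ == "__main__":
    chain = abc_mcmc(n_obs=50)
    kept = chain[2000:]  # discard burn-in
    print("approximate posterior mean:", sum(kept) / len(kept))
```

Unlike plain ABC rejection sampling, each proposal here depends on the current state of the chain, which is the Markov property the abstract refers to. For a repulsive process, the forward simulator and the summary statistics would have to be replaced by model-specific choices.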
Title: Bayesian strategies for repulsive spatial point processes. Journal: Spatial Statistics, Volume 73, Article 100962.
Pub Date: 2026-01-27, DOI: 10.1016/j.spasta.2026.100959
Vivian Yi-Ju Chen, Yi-Jin Li
Geographically weighted regression (GWR) has been actively extended to accommodate count outcomes, yet existing approaches typically rely on restrictive distributional assumptions (e.g., Poisson, negative binomial) or two-part mixtures (e.g., zero-inflated models) that complicate estimation and interpretation. In this study, we propose a geographically weighted Poisson–Tweedie model (GWPTM), which integrates the Poisson–Tweedie distribution family into the GWR framework to provide a flexible approach for spatial count data analysis. By specifying variance as a power function of the mean, GWPTM unifies Poisson, negative binomial, and related count processes within a single-stage framework. This enables the model to naturally account for a broad spectrum of dispersion patterns as well as excess zeros and tail behavior, while allowing both regression coefficients and distributional parameters to vary across space. We develop an estimating function approach for local parameter estimation and inference. Simulation studies show that GWPTM accurately recovers spatially varying relationships, adapts effectively to heterogeneous dispersion patterns, and exhibits competitive performance against benchmark methods as well as favorable finite-sample behavior. An application to Taiwan dengue fever data further illustrates the practical advantages of GWPTM, which achieves superior explanatory and predictive performance and reveals pronounced spatial nonstationarity in both covariate effects and distributional characteristics that competing methods fail to capture. Overall, the proposed GWPTM offers a useful and parsimonious framework for analyzing spatially heterogeneous count data.
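The phrase "variance as a power function of the mean" can be written out as follows. This is the usual Poisson–Tweedie mean and variance relationship, stated here as an assumption because the abstract does not give the formula; the log link, the location symbol $u_i$, and the notation $\beta(u_i)$, $\phi(u_i)$, $p(u_i)$ for spatially varying parameters are likewise illustrative.

```latex
\[
  \log \mu_i = x_i^{\top} \beta(u_i), \qquad
  \mathbb{E}[Y_i] = \mu_i, \qquad
  \operatorname{Var}(Y_i) = \mu_i + \phi(u_i)\, \mu_i^{\,p(u_i)} .
\]
```

Under this parameterization, letting $\phi \to 0$ recovers the Poisson case (variance equal to the mean), and $p = 2$ gives the negative binomial form $\mu + \phi \mu^2$, which is how a single specification can cover several familiar count models.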
Title: Geographically weighted Poisson–Tweedie model for count data. Journal: Spatial Statistics, Volume 72, Article 100959.
Pub Date: 2026-01-23, DOI: 10.1016/j.spasta.2026.100958
Ryan Cotsakis, Elena Di Bernardino, Thomas Opitz
Extreme events arising in georeferenced stochastic processes can take various forms, such as occurring in isolated patches or stretching contiguously over large areas, and can further vary with the spatial location and the extremeness of the events. We use excursion sets above threshold exceedances in data observed over a two-dimensional grid of rectangular pixels to propose a general family of coefficients that assess spatial-extent properties relevant for risk assessment, and study five candidate coefficients from this family. These coefficients are defined locally and interpreted as a spatial distance from a reference site where the threshold is exceeded. We develop statistical inference and discuss robustness to boundary effects and resolution of the pixel grid. To statistically extrapolate coefficients towards very high threshold levels, we formulate an asymptotically motivated semiparametric model and estimate a parameter characterizing how coefficients scale with the quantile level of the threshold. The utility of the new coefficients is illustrated through simulated data, as well as in an application to gridded daily temperature in continental France.
Title: Assessing the size of spatial extreme events using local coefficients based on excursion sets. Journal: Spatial Statistics, Volume 72, Article 100958.
Pub Date: 2026-01-21, DOI: 10.1016/j.spasta.2026.100957
Yirao Zhang, Rob Deardon, Lorna Deeth
Individual-level models, also known as ILMs, are commonly used in epidemic modelling, as they can flexibly incorporate individual-level covariates that influence susceptibility and transmissibility upon infection. However, inference for ILMs is computationally intensive, especially as the total population size increases and additional covariates are incorporated. We propose a composite method, the composite ILM (C-ILM), that clusters the population into minimally-interfered subpopulations, with between-cluster infections enabled through a “spark function.” This approach allows for parallel computation of subsets before aggregation. Focusing on C-ILM, we consider four “spark functions” and introduce a Dirichlet process mixture modelling (DPMM) algorithm for clustering. Simulation results indicate that, in addition to faster computation, C-ILM performs well in parameter estimation and posterior prediction. Furthermore, within the C-ILM framework, the DPMM algorithm demonstrates superior performance compared to the conventional K-means algorithm. We apply the methods to data from the 2001 UK foot-and-mouth disease outbreak. The results provide evidence that C-ILM is not only computationally efficient but also achieves a better model fit than the basic spatial ILM.
Title: Composite method for fast computation of individual level spatial epidemic models. Journal: Spatial Statistics, Volume 72, Article 100957.
Pub Date: 2026-01-17, DOI: 10.1016/j.spasta.2026.100956
R.M. Di Biase, L. Fattorini, S. Franceschi, A. Marcelli, M. Marcheselli, C. Pisani
For the first time in ecological applications, the coverage of an attribute is estimated by line-strip sampling, in which several strips of fixed width, running across the whole study area, are selected on a baseline and the coverage within these strips is recorded. Under line-strip sampling, the coverage can be expressed as the integral of the partial coverages within the strips, which enables its estimation through Monte Carlo integration methods in which strips are randomly placed on the baseline according to uniform random sampling, tessellation stratified sampling, or systematic grid sampling. A simulation study based on real habitat maps of three coastal dune systems in the United Kingdom is conducted to assess the performance of these three integration strategies. Simulation results suggest that tessellation stratified sampling is the most suitable scheme for locating strips. Moreover, a case study on alien species coverage in a Mediterranean dune ecosystem in Italy is examined. Finally, the advantages of line-strip sampling over familiar schemes such as point sampling and line-intercept sampling are discussed.
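The three strip-placement schemes can be compared on a toy example. The sketch below is an illustration under stated assumptions, not the authors' estimator: the "habitat map" is a hypothetical disc-shaped patch in the unit square, strips are vertical bands of width `w` centred at baseline positions `u`, and all function names are invented for this example. Because the patch lies at least `w/2` away from the edges, averaging the strip areas over the baseline and dividing by `w` gives an unbiased estimate of the coverage.

```python
import math
import random

R, CX = 0.3, 0.5                  # hypothetical disc-shaped patch in the unit square
TRUE_COVERAGE = math.pi * R ** 2  # area of the patch (study area has area 1)

def chord(x):
    """Length of the patch along the vertical line at baseline position x."""
    d2 = R ** 2 - (x - CX) ** 2
    return 2.0 * math.sqrt(d2) if d2 > 0 else 0.0

def strip_area(u, w, m=40):
    """Patch area inside the strip of width w centred at u (midpoint rule)."""
    h = w / m
    return h * sum(chord(u - w / 2 + (j + 0.5) * h) for j in range(m))

def centres(scheme, n, rng):
    """Baseline positions of n strips under the three placement schemes."""
    if scheme == "uniform":        # n independent uniform positions
        return [rng.random() for _ in range(n)]
    if scheme == "stratified":     # one uniform position in each of n equal intervals
        return [(i + rng.random()) / n for i in range(n)]
    if scheme == "systematic":     # one random start, then equal spacing
        start = rng.random() / n
        return [start + i / n for i in range(n)]
    raise ValueError(scheme)

def estimate(scheme, n, w, rng):
    """Monte Carlo estimate of coverage: mean strip area divided by strip width."""
    return sum(strip_area(u, w) for u in centres(scheme, n, rng)) / (n * w)

def rmse(scheme, n=10, w=0.05, reps=500, seed=1):
    """Root mean squared error of the estimator over repeated strip placements."""
    rng = random.Random(seed)
    errs = [(estimate(scheme, n, w, rng) - TRUE_COVERAGE) ** 2 for _ in range(reps)]
    return math.sqrt(sum(errs) / reps)

if __name__ == "__main__":
    for s in ("uniform", "stratified", "systematic"):
        print(s, round(rmse(s), 4))
```

Under uniform placement strips may overlap, which this sketch allows for simplicity. The relative ranking of the schemes on real habitat maps depends on the spatial structure of the attribute, so the toy results should not be read as reproducing the paper's simulation study.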
Title: Reframing coverage estimation under line-strip sampling in the Monte Carlo integration framework. Journal: Spatial Statistics, Volume 72, Article 100956.
Pub Date: 2026-01-17, DOI: 10.1016/j.spasta.2026.100955
Yongqi Wang, Yunquan Song
Transfer learning is a machine learning approach that enhances target domain performance by leveraging knowledge from source domains. Although this method has been widely applied in regression problems, research remains limited for scenarios involving partially missing response data in the target domain. This study addresses the dual challenges of missing responses and small sample sizes in spatially dependent regression problems by proposing an EM algorithm-based transfer learning framework. The framework first employs the EM algorithm to handle missing responses in spatial autoregressive models, then develops a two-step transfer learning method for known source domains, along with a cross-validation-based detection algorithm for unknown transferable sources. Numerical simulations demonstrate that the proposed methods exhibit superior performance in both parameter estimation accuracy and model robustness.
Title: Transfer learning for spatial autoregressive models with missing responses. Journal: Spatial Statistics, Volume 72, Article 100955.
The growing accessibility of Light Detection and Ranging (LiDAR) data opens up novel perspectives that are crucial for tracking forest growth and enhancing resource management amid climate change. Using these data to build decision-support tools involves a vital step of segmenting individual trees. A widely adopted class of methods for this step is known as Local Maxima algorithms, which, although unsupervised, rely on per-site and/or per-species hyperparameter tuning for optimal performance. In this work, we introduce a novel methodological framework grounded in point process theory to jointly model the data generation process and provide formal implementation guidelines for refining window size selection within the class of Local Maxima algorithms. This methodology can also be applied to incomplete plot measurements, alleviating a constraint noted in most data acquisition procedures. To ensure the reproducibility of the results and validate the practical application, we apply the proposed methodology in two cases: (i) a simulated dataset (made publicly available) and (ii) an open real dataset. The simulation study evaluates performance under spatial configurations that do not necessarily follow the assumed point process model used for window calibration, thereby assessing robustness to model misspecification. The method outperforms the baseline approaches in the simulation study for the detection task, and achieves F1-scores between 55% and 90% on real data. On average, it improves upon the second-best method by about 4%, with performance depending on a tree’s position within the canopy relative to its neighbors.
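Why the window size matters for Local Maxima detection can be seen on a toy canopy height model. The sketch below uses a fixed square window and a synthetic grid; it does not implement the point-process-based window selection proposed in the paper. The grid, the crown parameters, the `min_height` threshold, and the function names are assumptions made for illustration.

```python
import math

def synthetic_chm(size=30):
    """Toy canopy height model: two tree crowns plus a smaller secondary bump."""
    # (row, col, height, spread) of each Gaussian-shaped feature; values are hypothetical.
    tops = [(8, 8, 20.0, 3.0), (20, 21, 15.0, 3.0), (8, 14, 8.0, 1.0)]
    return [[sum(h * math.exp(-((r - tr) ** 2 + (c - tc) ** 2) / (2 * s ** 2))
                 for tr, tc, h, s in tops)
             for c in range(size)]
            for r in range(size)]

def detect_tops(chm, k, min_height=2.0):
    """Return (row, col) cells that are the strict maximum of their
    (2k+1) x (2k+1) window and at least min_height tall."""
    n_rows, n_cols = len(chm), len(chm[0])
    tops = []
    for r in range(n_rows):
        for c in range(n_cols):
            h = chm[r][c]
            if h < min_height:
                continue
            # Windows are truncated at the grid border.
            neighbours = [chm[i][j]
                          for i in range(max(0, r - k), min(n_rows, r + k + 1))
                          for j in range(max(0, c - k), min(n_cols, c + k + 1))
                          if (i, j) != (r, c)]
            if all(h > v for v in neighbours):
                tops.append((r, c))
    return tops

if __name__ == "__main__":
    chm = synthetic_chm()
    for k in (1, 3):
        print(f"window {2 * k + 1}x{2 * k + 1}:", detect_tops(chm, k))
```

With the small window the secondary bump is reported as a separate tree top, while the larger window absorbs it into the neighbouring crown. Whether the bump is a real tree or a branch is exactly the ambiguity that makes window calibration site- and species-dependent.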
Title: Adaptive local maxima windows for tree detection: A point process perspective. Authors: Konstantinos Florakis, Véronique Letort, Raphaël Canals, Gilles Faÿ, Samis Trevezas. Pub Date: 2026-01-10, DOI: 10.1016/j.spasta.2026.100954. Journal: Spatial Statistics, Volume 72, Article 100954.
Pub Date: 2026-01-06, DOI: 10.1016/j.spasta.2026.100953
Youhua Chen, Tsung-Jen Shen
A common practice in ecological and biodiversity research for estimating local species diversity levels is to integrate both a regional species abundance distribution model and a spatial distributional aggregation model of species. In this study, we argue that the inclusion of a species-specific spatial aggregation model is unnecessary in many cases because the regional species abundance distribution model can be directly transformed into a local species abundance distribution model to estimate local species richness and diversity levels. We support this claim by extensively investigating varying-scale species-area relationship (SAR) patterns through a spatially explicit semi-empirical test on a fully censused forest plot, considering various spatial sampling scenarios. When local spatial sampling is conducted randomly with small or moderate operative sampling units (i.e., quadrats), estimated species richness closely matches theoretical expectations for the SAR curve (i.e., the SAR rarefaction curve including both interpolation and extrapolation), as the corresponding confidence intervals consistently cover the true values. However, during extrapolation (i.e., spatially sampling a local proportion of the forest plot and estimating species richness for a larger proportion of the plot), species richness tends to be underestimated when local spatial sampling is conducted using large quadrats or a single contiguous region, likely due to the effect of spatial autocorrelation. Moreover, contiguous-area sampling becomes challenging in macro-ecological and spatial ecology research when the single area covers natural barriers such as rivers or steep terrain. By contrast, ecologists typically rely on information collected from many small sampling plots for biodiversity inference. Hence, in field practice, local spatial sampling, or more specifically the integration of a spatial distributional aggregation model of species for estimating biodiversity levels, is unnecessary in most cases. In conclusion, as long as ecologists can implement spatially random and unconstrained sampling, the two-step modeling approach is not supported and tends to produce potentially misleading conclusions on diversity estimation and extinction risk assessments. Nonetheless, the local spatial aggregation model can still be helpful when large portions of the study region are inaccessible or when local sampling cannot be conducted freely and randomly in space. A computational R package for estimating and plotting SAR with unconditional variance calculation is available at the following URL: https://zenodo.org/records/14821773.
Title: Estimating and plotting species-area relationship: Does aggregate distribution of species really matter? Journal: Spatial Statistics, Volume 72, Article 100953.
Pub Date: 2026-01-02, DOI: 10.1016/j.spasta.2025.100952
Jong Hyeon Lee, Jongmin Kim, Heesang Lee, Jaewoo Park
A large class of spatial models contains intractable normalizing functions, such as spatial lattice models, interaction spatial point processes, and social network models. Bayesian inference for such models is challenging since the resulting posterior distribution is doubly intractable. Although auxiliary variable MCMC (AVM) algorithms are known to be the most practical, they are computationally expensive due to the repeated auxiliary variable simulations. To address this, we propose delayed-acceptance AVM (DA-AVM) methods, which can reduce the number of auxiliary variable simulations. The first stage of the kernel uses a cheap surrogate to decide whether to accept or reject the proposed parameter value. The second stage guarantees detailed balance with respect to the posterior. The auxiliary variable simulation is performed only on the parameters accepted in the first stage. We construct various surrogates specifically tailored for doubly intractable problems, including subsampling strategy, Gaussian process emulation, and frequentist estimator-based approximation. We validate our method through simulated and real data applications, demonstrating its practicality for complex spatial models.
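The two-stage kernel described above follows the general delayed-acceptance Metropolis–Hastings pattern, which can be sketched on a toy problem. This is not the authors' DA-AVM: the target here is a tractable normal density standing in for the expensive step (which in the paper involves auxiliary variable simulation), and the surrogate, step size, and function names are assumptions chosen for illustration.

```python
import math
import random

def log_target(theta):
    """Stand-in for the expensive posterior evaluation (here: N(1, 1), unnormalised)."""
    return -0.5 * (theta - 1.0) ** 2

def log_surrogate(theta):
    """Cheap, deliberately imperfect approximation (here: N(0.8, 1.2^2), unnormalised)."""
    return -0.5 * ((theta - 0.8) / 1.2) ** 2

def delayed_acceptance(n_iter=20000, step=1.5, seed=0):
    """Delayed-acceptance random-walk Metropolis.

    Returns the chain and the number of expensive target evaluations.
    """
    rng = random.Random(seed)
    theta = 0.0
    lt, ls = log_target(theta), log_surrogate(theta)
    expensive_calls = 1
    chain = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)  # symmetric proposal
        ls_prop = log_surrogate(prop)
        # Stage 1: screen the proposal using the surrogate only.
        if math.log(1.0 - rng.random()) < ls_prop - ls:
            # Stage 2: evaluate the expensive target and correct for the
            # surrogate, which restores detailed balance w.r.t. the target.
            lt_prop = log_target(prop)
            expensive_calls += 1
            if math.log(1.0 - rng.random()) < (lt_prop - lt) - (ls_prop - ls):
                theta, lt, ls = prop, lt_prop, ls_prop
        chain.append(theta)
    return chain, expensive_calls

if __name__ == "__main__":
    chain, calls = delayed_acceptance()
    kept = chain[2000:]  # discard burn-in
    print("posterior mean estimate:", sum(kept) / len(kept))
    print("expensive evaluations:", calls, "out of", len(chain), "iterations")
```

The expensive evaluation is triggered only for proposals that pass the first stage, so a better surrogate rejects poor proposals cheaply without biasing the chain. A poor surrogate still yields a valid sampler, but it wastes second-stage evaluations or lowers the overall acceptance rate.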
{"title":"A delayed acceptance auxiliary variable MCMC for spatial models with intractable likelihood function","authors":"Jong Hyeon Lee, Jongmin Kim, Heesang Lee, Jaewoo Park","doi":"10.1016/j.spasta.2025.100952","journal":{"name":"Spatial Statistics","volume":"72","pages":"Article 100952"},"PeriodicalIF":2.5,"publicationDate":"2026-01-02","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date : 2025-12-30 DOI: 10.1016/j.spasta.2025.100951
Soyun Jeon, Jungsoon Choi
Elevated levels of PM10 are known to cause severe respiratory and cardiovascular diseases, and, in extreme cases, cancer and mortality. Despite various reduction policies implemented across different sectors, PM10 concentrations in South Korea continue to exceed the annual recommended limit set by the World Health Organization. Spatio-temporal PM10 concentrations may exhibit both spatial and temporal dependence. Additionally, interactions between PM10 and environmental factors can further influence the variability in PM10. Therefore, this study proposes a method that captures spatio-temporal interactions by incorporating the spatio-temporal neighbors of the covariates alongside those of PM10. A vine copula was used to integrate pairwise dependence structures between a given location and its surrounding spatio-temporal neighbors. We applied the model to weekly average PM10 data for South Korea in 2019, using PM2.5, CO, population density, nighttime light intensity, land-use mix, and air temperature as covariates. As PM10 exhibited skewness, its marginal distribution was modeled using the Gumbel and Generalized Extreme Value distributions. The proposed model outperformed a spatio-temporal mixed effects model, a kriging method, and alternative copula-based approaches, particularly in predicting the top 5% of extreme values, by effectively capturing the tail dependence that is crucial for extreme value analysis. This study highlights the importance of using vine copulas to model diverse dependence structures in spatio-temporal data while simultaneously accommodating spatial and temporal dimensions, including spatio-temporal dependence among covariates. The results underscore the broader applicability of the proposed approach to other fields where complex dependence structures are present.
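A vine copula is assembled from pair copulas fitted to data that have first been mapped to the unit interval through their marginal distributions. The sketch below illustrates one such building block under stated assumptions and is not the authors' model: the two series are simulated stand-ins for PM10 at a site and at a neighbor, the marginals are Gumbel fits via `scipy.stats.gumbel_r`, and the pair copula is assumed to be a Gumbel copula. For that family, Kendall's tau satisfies tau = 1 - 1/theta and the upper tail dependence coefficient is 2 - 2^(1/theta). The function name and the data are hypothetical.

```python
import numpy as np
from scipy import stats

def gumbel_pair_summary(x, y):
    """Fit Gumbel marginals, map to uniforms, and summarise one pair copula."""
    # Marginal step: fit a right-skewed Gumbel distribution to each series.
    loc_x, scale_x = stats.gumbel_r.fit(x)
    loc_y, scale_y = stats.gumbel_r.fit(y)
    # Probability integral transform to (approximately) uniform margins.
    u = stats.gumbel_r.cdf(x, loc=loc_x, scale=scale_x)
    v = stats.gumbel_r.cdf(y, loc=loc_y, scale=scale_y)
    # Rank-based dependence between the transformed series.
    tau, _ = stats.kendalltau(u, v)
    # Gumbel copula parameter from tau = 1 - 1/theta (requires 0 <= tau < 1).
    tau_clipped = min(max(tau, 0.0), 0.999)
    theta = 1.0 / (1.0 - tau_clipped)
    # Upper tail dependence coefficient of the Gumbel copula.
    lambda_upper = 2.0 - 2.0 ** (1.0 / theta)
    return u, v, tau, theta, lambda_upper

# Hypothetical data: two skewed series sharing a common component.
rng = np.random.default_rng(0)
shared = rng.gumbel(size=500)
x = 40 + 10 * (shared + 0.5 * rng.gumbel(size=500))
y = 35 + 8 * (shared + 0.5 * rng.gumbel(size=500))

u, v, tau, theta, lam = gumbel_pair_summary(x, y)
print(f"Kendall's tau: {tau:.2f}, theta: {theta:.2f}, upper tail dep.: {lam:.2f}")
```

A positive upper tail dependence coefficient means that extremes in one series tend to coincide with extremes in the other. The abstract credits this tail dependence for the improved prediction of the top 5% of values.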
{"title":"Copula-based spatio-temporal modeling of air pollutant data incorporating covariate dependence","authors":"Soyun Jeon, Jungsoon Choi","doi":"10.1016/j.spasta.2025.100951","journal":{"name":"Spatial Statistics","volume":"72","pages":"Article 100951"},"PeriodicalIF":2.5,"publicationDate":"2025-12-30","publicationTypes":"Journal Article","isOpenAccess":false}