Spatial Statistics: latest articles (page 2)

Copula-based spatio-temporal modeling of air pollutant data incorporating covariate dependence

IF 2.5 | Zone 2, Mathematics | Q3 GEOSCIENCES, MULTIDISCIPLINARY

Spatial Statistics

Pub Date : 2025-12-30 DOI: 10.1016/j.spasta.2025.100951

Soyun Jeon, Jungsoon Choi

Elevated levels of PM₁₀ are known to cause severe respiratory and cardiovascular diseases, and, in extreme cases, cancer and mortality. Despite various reduction policies implemented across different sectors, PM₁₀ concentrations in South Korea continue to exceed the annual recommended limit set by the World Health Organization. Spatio-temporal PM₁₀ concentrations may exhibit both spatial and temporal dependence. Additionally, interactions between PM₁₀ and environmental factors can further influence the variability in PM₁₀. Therefore, this study proposes a method that incorporates the spatio-temporal neighbors of covariates alongside those of PM₁₀ by adopting an approach that captures spatio-temporal interactions through spatio-temporal neighbors. A vine copula was used to integrate pairwise dependence structures between a given location and its surrounding spatio-temporal neighbors. We applied the model to weekly average PM₁₀ data for South Korea in 2019, using PM₂.₅, CO, population density, nighttime light intensity, land-use mix and air temperature as covariates. As PM₁₀ exhibited skewness, its marginal distribution was modeled using the Gumbel and Generalized Extreme Value distributions. The proposed model outperformed a spatio-temporal mixed effects model, a kriging method, and alternative copula-based approaches, particularly in predicting the top 5% of extreme values, by effectively capturing tail dependence crucial for extreme value analysis. This study highlights the importance of using vine copulas to effectively model diverse dependence structures in spatio-temporal data while simultaneously accommodating spatial and temporal dimensions, including spatio-temporal dependence among covariates. The results underscore the broader applicability of the proposed approach to other fields where complex dependence structures are present.
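As a rough illustration of the marginal-modeling step mentioned in the abstract, the sketch below evaluates a Gumbel distribution function and its quantile function. The location and scale values are hypothetical placeholders, not estimates from the paper, and the function names are made up for this example. In a copula model the fitted marginal is typically used to map observations to the unit interval (u = F(x)) before any dependence structure is fitted; the vine copula itself is not shown here.

```python
import math


def gumbel_cdf(x, mu, beta):
    """CDF of the Gumbel (maximum) distribution, location mu, scale beta > 0."""
    return math.exp(-math.exp(-(x - mu) / beta))


def gumbel_quantile(p, mu, beta):
    """Inverse CDF for 0 < p < 1."""
    return mu - beta * math.log(-math.log(p))


# Hypothetical parameters for weekly PM10 in µg/m³ (assumed, not from the paper).
mu, beta = 40.0, 10.0

# Level exceeded by the top 5% of values under this assumed marginal.
threshold_95 = gumbel_quantile(0.95, mu, beta)

# Probability integral transform: observations mapped to (0, 1),
# the scale on which a copula would be fitted.
observations = [35.0, 48.0, 72.0]
pseudo_obs = [gumbel_cdf(x, mu, beta) for x in observations]

print(round(threshold_95, 2), [round(u, 3) for u in pseudo_obs])
```

The quantile function is simply the algebraic inverse of the distribution function, so the two can be checked against each other.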

Volume 72, Article 100951

Citations: 0

Determinants of vote transitions by ecological inference within small areas

IF 2.5 | Zone 2, Mathematics | Q3 GEOSCIENCES, MULTIDISCIPLINARY

Spatial Statistics

Pub Date : 2025-12-16 DOI: 10.1016/j.spasta.2025.100950

Bruno Bracalente, Antonio Forcina, Nicola Falocci

Empirical analyses on the factors driving vote switching are rare, usually conducted at the national level and often unreliable due to the inaccuracy of recall survey data. To overcome the problem of lack of adequate individual survey data, and to incorporate the increasingly relevant role of local factors, we propose an ecological inference methodology to estimate the counts of vote transitions within small homogeneous areas and to assess their relationships with local characteristics through multinomial logistic models. This approach allows for a disaggregate analysis of contextual factors behind vote switching both across origins and destinations. We apply this methodology to the Italian region of Umbria, divided into 19 small areas. To explain the number of transitions toward the right-wing nationalist party that won the 2022 general elections and towards increasing abstentionism, we focused on measures of geographical, economic, and cultural disadvantages of local communities. Among the main findings, the economic disadvantages mainly pushed previous abstainers and far-right Lega voters to change their choices in favor of the rising right-wing party, while transitions from the opposite political camp were mostly influenced by cultural factors such as a lack of social capital, negative attitude towards the EU, and political tradition.
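To make the multinomial logistic step concrete, here is a minimal sketch of how destination probabilities could be computed for voters from a single origin in one area. The destinations, the single covariate, and all coefficient values are hypothetical and chosen only for illustration; the ecological inference step that estimates the transition counts is not shown.

```python
import math


def multinomial_logit_probs(x, coefs):
    """Destination probabilities for one area.

    x     : list of local covariates.
    coefs : dict mapping destination -> [intercept, slope_1, ...];
            the reference destination has all-zero coefficients.
    """
    scores = {
        dest: b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
        for dest, b in coefs.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    expd = {dest: math.exp(s - top) for dest, s in scores.items()}
    total = sum(expd.values())
    return {dest: e / total for dest, e in expd.items()}


# Hypothetical coefficients; "same_party" is the reference category.
coefs = {
    "same_party": [0.0, 0.0],
    "right_wing": [-0.5, 1.2],
    "abstention": [-1.0, 0.6],
}
economic_disadvantage = [0.8]  # assumed standardized local index

probs = multinomial_logit_probs(economic_disadvantage, coefs)
print({dest: round(p, 3) for dest, p in probs.items()})
```

With a positive slope, a higher value of the covariate raises the odds of that destination relative to the reference category.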

Volume 71, Article 100950

Citations: 0

Voronoi linkage between mismatching voting stations and census tracts in analyzing the 2018 Brazilian presidential election data

IF 2.5 | Zone 2, Mathematics | Q3 GEOSCIENCES, MULTIDISCIPLINARY

Spatial Statistics

Pub Date : 2025-12-16 DOI: 10.1016/j.spasta.2025.100949

Lucas da Cunha Godoy, Marcos Oliveira Prates, Jun Yan

In Brazil, socioeconomic data are available at census tracts (polygons), while election data are available at voting locations (point-referenced). The misaligned data makes studying the association between election outcomes and socioeconomic variables challenging. Since voters vote at the nearest voting stations, we use a Voronoi tessellation to associate each voting station with a Voronoi polygon. Socioeconomic variables for each polygon are then constructed from such data at the census tract level, assuming that both sets of areal data were constructed from the same underlying Gaussian field (GF). Predictions for the Voronoi cells are derived from the underlying GF with estimated parameters. Since the socioeconomic variables are not normally distributed, we also consider a nonparametric approach that uses spatial areal interpolation to construct data for the Voronoi cells from the census tract data. The interpolated outputs are used as a baseline. Our simulation study shows that the method based on an underlying GF is robust in prediction under model misspecification. In application to the 2018 Brazilian presidential election in Belo Horizonte, more socioeconomically deprived regions were found to have a higher percentage of null votes. The proposed methods are implemented in the R package smile, facilitating reproducible spatial analyses under misalignment.
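The defining property of a Voronoi cell is that every location inside it is closer to its own site than to any other site. The sketch below uses that property in a deliberately simplified way: each census tract is reduced to its centroid, assigned to the nearest voting station, and a population-weighted mean is computed per cell. This is neither the Gaussian-field method nor the areal interpolation described in the abstract, and it does not use the API of the smile package; all names and numbers are hypothetical.

```python
import math
from collections import defaultdict


def nearest_station(point, stations):
    """Id of the station closest to `point` (Euclidean distance)."""
    return min(stations, key=lambda sid: math.dist(point, stations[sid]))


def aggregate_to_cells(tracts, stations):
    """Population-weighted mean of a tract-level variable per Voronoi cell.

    tracts   : list of (centroid, population, value) tuples.
    stations : dict mapping station id -> (x, y) coordinates.
    """
    weighted_sum = defaultdict(float)
    population = defaultdict(float)
    for centroid, pop, value in tracts:
        sid = nearest_station(centroid, stations)
        weighted_sum[sid] += pop * value
        population[sid] += pop
    return {sid: weighted_sum[sid] / population[sid] for sid in population}


# Hypothetical stations and tracts (coordinates in arbitrary units).
stations = {"A": (0.0, 0.0), "B": (10.0, 0.0)}
tracts = [
    ((1.0, 1.0), 100, 0.2),   # centroid, population, deprivation index
    ((2.0, -1.0), 300, 0.4),
    ((9.0, 0.0), 200, 0.1),
]

cells = aggregate_to_cells(tracts, stations)
print(cells)
```

Assigning whole tracts by centroid ignores tracts that straddle a cell boundary, which is exactly the misalignment the abstract's model-based approach is meant to handle.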

Volume 71, Article 100949

Citations: 0

A heavy-tailed model for multivariate spatial processes

IF 2.5 | Zone 2, Mathematics | Q3 GEOSCIENCES, MULTIDISCIPLINARY

Spatial Statistics

Pub Date : 2025-12-10 DOI: 10.1016/j.spasta.2025.100948

Paritosh Kumar Roy, Alexandra M. Schmidt

Environmental data commonly involves measuring multiple pollutants, such as NO₂ and PM₁₀ levels, at some fixed sites across a region. Data analysts aim to describe the processes accounting for covariance across space and among pollutants, usually assuming a multivariate spatial Gaussian model with a stationary covariance function. However, the observed data distribution often exhibits heterogeneous variability, resulting in heavier tails than the Gaussian distribution. To address these challenges by avoiding data transformation, we propose a flexible multivariate spatial model with spatially varying covariate-dependent variance that naturally accommodates heavy-tailed distributions. Specifically, we extend the linear model of coregionalization by modeling the variances of the processes, allowing them to vary across space and depending on covariates. We discuss the properties of the proposed model and outline a Bayesian inference procedure implemented using the software Stan. As the model involves several Gaussian process components, we further discuss Vecchia-based approximation methods for analyzing large spatial datasets. Artificial data analyses suggest that the model’s parameters are identifiable and can accurately detect outlying observations if they exist, underscoring the model’s reliability and robustness. The model quantifies uncertainty and captures local structures more effectively than the multivariate Gaussian model when applied to maximum concentrations of NO₂ and PM₁₀ on a day at 382 sites across Italy. Further, the described approximation methods show effectiveness in analyzing large spatial datasets.

Volume 71, Article 100948

Citations: 0

A simultaneous system of dynamic spatial stochastic frontier models with dependent error components and inefficiency determinants

IF 2.5 | Zone 2, Mathematics | Q3 GEOSCIENCES, MULTIDISCIPLINARY

Spatial Statistics

Pub Date : 2025-11-29 DOI: 10.1016/j.spasta.2025.100947

S. Emili, F. Galli

In this paper, we develop a system of simultaneous stochastic frontier models with inefficiency determinants, spatio-temporal effects and correlated inefficiency as well as correlated random errors among frontiers. The dependence among the errors of the different equations can stem from either shocks external to the system, interrelated inefficiency mechanisms, or a combination of both. Estimation is performed using a copula-based quasi-maximum likelihood approach. Simulation results confirm the good finite sample properties of the proposed estimator. To demonstrate the effectiveness of the proposed model and estimation technique in empirical settings, we analyse the key role of some sustainability-related factors in determining the efficiency level of Italian cultural and creative sectors.

Citations: 0

Joint modeling of line and point data on metric graphs

IF 2.5 | Zone 2, Mathematics | Q3 GEOSCIENCES, MULTIDISCIPLINARY

Spatial Statistics

Pub Date : 2025-11-21 DOI: 10.1016/j.spasta.2025.100946

Karina Lilleborge, Sara Martino, Geir-Arne Fuglstad, Finn Lindgren, Rikke Ingebrigtsen

Metric graphs are useful tools for describing spatial domains like road and river networks, where spatial dependence acts along the network. We take advantage of recent developments for Gaussian Random Fields (GRFs) on such graphs, and consider joint spatial modeling of observations with different spatial supports. Motivated by an application to traffic state modeling in Trondheim, Norway, we consider line-referenced data, which can be described by an integral of the GRF along a line segment on the metric graph, and point-referenced data. Through a simulation study inspired by the application, we investigate the number of replicates that are needed to estimate parameters and to predict unobserved locations. The former is assessed using bias and variability, and the latter is assessed through root mean square error (RMSE), continuous ranked probability scores (CRPSs), and coverage. Joint modeling is contrasted with a simplified approach that treats line-referenced observations as point-referenced observations. The results suggest joint modeling leads to strong improvements. The application to Trondheim, Norway, combines point-referenced induction loop data and line-referenced public transportation data. To ensure positive speeds, we use a non-linear link function, which requires integrals of non-linear combinations of the linear predictor. This is made computationally feasible by a combination of the R packages inlabru and MetricGraph, and new code for processing geographical line data to work with existing graph representations and fmesher methods for dealing with line support in inlabru on objects from MetricGraph. We fit the model to two datasets where we expect different spatial dependency and compare the results.
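The three prediction criteria named in the abstract can be sketched as follows. The CRPS function uses the standard closed form for a Gaussian predictive distribution, which is an assumption made here for brevity; the paper may well compute CRPS from posterior samples instead. The observed speeds and predictive means and standard deviations are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error of point predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def crps_gaussian(y, mu, sigma):
    """CRPS of a Normal(mu, sigma^2) predictive distribution at observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def coverage(obs, mus, sigmas, z=1.959963984540054):
    """Share of observations inside the central 95% Gaussian interval."""
    hits = sum(abs(o - m) <= z * s for o, m, s in zip(obs, mus, sigmas))
    return hits / len(obs)


# Hypothetical observed speeds (km/h) with predictive means and std. deviations.
obs = [42.0, 55.0, 30.0, 61.0]
mus = [40.0, 57.0, 38.0, 60.0]
sigmas = [3.0, 2.0, 2.5, 4.0]

mean_crps = sum(crps_gaussian(y, m, s) for y, m, s in zip(obs, mus, sigmas)) / len(obs)
print(rmse(obs, mus), mean_crps, coverage(obs, mus, sigmas))
```

RMSE scores only the point prediction, whereas CRPS and coverage also penalize predictive distributions that are too narrow or too wide.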

Volume 71, Article 100946

Citations: 0

Explicit modeling of density dependence in spatial capture-recapture models

IF 2.5 | Zone 2, Mathematics | Q3 GEOSCIENCES, MULTIDISCIPLINARY

Spatial Statistics

Pub Date : 2025-11-21 DOI: 10.1016/j.spasta.2025.100945

Qing Zhao, Yunyi Shen

Density dependence occurs at the individual level and thus is greatly influenced by spatial local heterogeneity in habitat conditions. However, density dependence is often evaluated at the population level, leading to difficulties or even controversies in detecting such a process. Bayesian individual-based models such as spatial capture-recapture (SCR) models provide opportunities to study density dependence at the individual level, but such an approach remains to be developed and evaluated. In this study, we developed an SCR model that links habitat use to apparent survival and recruitment through density-dependent processes at the individual level. Using simulations, we found that the model can properly inform habitat use, but tends to underestimate the effect of density dependence on apparent survival and recruitment. Such underestimation is likely due to the difficulties of the current model in identifying the locations of unobserved individuals without using environmental covariates to inform these locations. How to accurately estimate the locations of unobserved individuals, and thus density dependence, remains a challenging topic in spatial statistics and statistical ecology.

Volume 71, Article 100945

Citations: 0

Multivariate spatio-temporal modelling for completing cancer registries and forecasting incidence

IF 2.5 | Zone 2, Mathematics | Q3 GEOSCIENCES, MULTIDISCIPLINARY

Spatial Statistics

Pub Date : 2025-11-20 DOI: 10.1016/j.spasta.2025.100944

Garazi Retegui, Jaione Etxeberria, María Dolores Ugarte

Cancer data, particularly cancer incidence and mortality, are fundamental to understanding the cancer burden, setting targets for cancer control, and evaluating the evolution of the implementation of a cancer control policy. However, the complexity of data collection, classification, validation and processing results in cancer incidence figures often lagging two to three years behind the calendar year. In response, national or regional population-based cancer registries (PBCRs) are increasingly interested in methods for forecasting cancer incidence. However, in many countries there is an additional difficulty in projecting cancer incidence, as regional registries are usually not established in the same year and therefore cancer incidence data series between different regions of a country are not harmonized over time. This study addresses the challenge of forecasting cancer incidence with incomplete data at both regional and national levels. To achieve this, we propose the use of multivariate spatio-temporal shared component models that jointly model mortality data and available cancer incidence data. We evaluate the performance of these multivariate models using lung cancer incidence data and the corresponding number of deaths reported in England for the period 2001–2019. Model performance was assessed using different predictive measures to select the best model.


Citations: 0

Misspecification issues between competitive spatio-temporal cluster point processes

IF 2.5 Zone 2 (Mathematics) Q3 GEOSCIENCES, MULTIDISCIPLINARY

Spatial Statistics

Pub Date : 2025-11-15 DOI: 10.1016/j.spasta.2025.100940

Alba Bernabeu , Claudio Fronterrè , Jorge Mateu

Spatio-temporal point pattern data often exhibit clustering, which may arise either from self-exciting mechanisms or from latent environmental heterogeneity. Hawkes processes and log-Gaussian Cox processes (LGCPs) represent two widely used but fundamentally different modelling approaches for capturing such patterns. While Hawkes processes assume event-driven triggering, LGCPs model clustering through a latent Gaussian random field acting on the intensity function. In practice, model selection between these alternatives is rarely conducted rigorously, and second-order characteristics are often insufficient to discriminate between them. We present a simulation-based comparative study that systematically evaluates estimation, second-order structure, and predictive performance under model misspecification. In particular, we assess the ability of each model to reproduce the dynamics of data generated under the competing framework. We also analyse real crime data to illustrate how inference and prediction are affected by model choice. Our results underscore the interpretive consequences of model misspecification and highlight key diagnostic limitations when disentangling clustering mechanisms in spatio-temporal processes.
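The two clustering mechanisms contrasted in this abstract can be sketched in a few lines. This is a minimal, purely temporal simplification and not the authors' simulation design: the exponential triggering kernel, the parameter names `mu`, `alpha` and `beta`, and the helper names are all assumptions made for illustration. In the Hawkes case the intensity depends on past events; in the LGCP case it depends only on a latent Gaussian field.

```python
import math
import random


def hawkes_intensity(t, past_events, mu, alpha, beta):
    """Conditional intensity at time t: baseline mu plus decaying kicks from earlier events."""
    return mu + sum(
        alpha * beta * math.exp(-beta * (t - s)) for s in past_events if s < t
    )


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon) by Ogata-style thinning.

    With an exponential kernel the intensity only decays between events,
    so its value just after the current time is a valid upper bound.
    """
    rng = random.Random(seed)
    t, events = 0.0, []
    while True:
        # Upper bound: intensity just after t (an event exactly at t counts).
        bound = mu + sum(alpha * beta * math.exp(-beta * (t - s)) for s in events)
        t += rng.expovariate(bound)
        if t >= horizon:
            return events
        # Accept the candidate with probability intensity(t) / bound.
        if rng.random() * bound <= hawkes_intensity(t, events, mu, alpha, beta):
            events.append(t)


def lgcp_intensity(latent_values):
    """LGCP-style intensity: exponential of latent Gaussian field values on a grid."""
    return [math.exp(z) for z in latent_values]


events = simulate_hawkes(mu=0.5, alpha=0.6, beta=2.0, horizon=50.0, seed=1)
print(f"simulated {len(events)} self-exciting events")
```

Here `alpha` plays the role of the branching ratio (the expected number of direct offspring per event), so it is kept below 1 to avoid an explosive process.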



Citations: 0

A visual method for identifying local over-densities in spatial data

IF 2.5 Zone 2 (Mathematics) Q3 GEOSCIENCES, MULTIDISCIPLINARY

Spatial Statistics

Pub Date : 2025-11-12 DOI: 10.1016/j.spasta.2025.100942

Yining Han , Xiaobin Chen , Xingxing Huang , Zhongyin Liu

The integration of multi-source data often results in the occurrence of local over-densities at point locations. These over-densities can bias the estimation of spatial statistical parameters, such as mean and variance, compromise the quality of variogram fitting, and degrade the accuracy of interpolation results. However, a standardized approach for identifying local over-densities in spatial datasets is currently lacking. In this paper, we propose three parameters—self-sparsity, mutual sparsity, and small-distance variability—and construct a three-parameter comprehensive cross-plot to facilitate the visual identification of local over-densities within spatial data points, thereby enabling further processing. Using both synthetic datasets generated via stochastic process simulations and real-world datasets, we demonstrate that the proposed three-parameter comprehensive cross-plot, based on self-sparsity, mutual sparsity, and small-distance variability, effectively identifies local over-densities in spatial datasets. Furthermore, by appropriately processing these over-densities, the accuracy of spatial statistical parameter estimation can be enhanced, a more reliable theoretical variogram model can be established, and both spatial statistical analysis and interpolation results can ultimately be improved.
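The claim that local over-densities bias the estimated mean can be illustrated with a small synthetic example. The sketch below does not implement the paper's three-parameter cross-plot, whose definitions the abstract does not give. It instead uses standard cell declustering as a stand-in correction, and the data, the function names and the `cell_size` value are all hypothetical.

```python
import random
from collections import defaultdict


def cell_declustered_mean(points, values, cell_size):
    """Average the per-cell means so that every occupied grid cell counts once."""
    cells = defaultdict(list)
    for (x, y), v in zip(points, values):
        cells[(int(x // cell_size), int(y // cell_size))].append(v)
    cell_means = [sum(vs) / len(vs) for vs in cells.values()]
    return sum(cell_means) / len(cell_means)


def demo_bias(seed=1):
    """Compare the naive and declustered means on synthetic, over-sampled data."""
    rng = random.Random(seed)
    # 200 background points, uniform on the unit square.
    points = [(rng.random(), rng.random()) for _ in range(200)]
    # 300 extra points crammed into the high-value corner (a local over-density).
    points += [(rng.uniform(0.9, 1.0), rng.uniform(0.9, 1.0)) for _ in range(300)]
    # Hypothetical attribute: simply the x-coordinate, whose areal mean is 0.5.
    values = [x for x, _ in points]
    naive = sum(values) / len(values)
    declustered = cell_declustered_mean(points, values, cell_size=0.1)
    return naive, declustered


naive, declustered = demo_bias()
print(f"naive mean = {naive:.3f}, declustered mean = {declustered:.3f}")
```

Because the over-sampled corner holds high values, the naive mean is pulled well above the areal mean of 0.5, while weighting each occupied cell equally brings the estimate back toward it.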



Citations: 0