Journal of Agricultural Biological and Environmental Statistics: Latest Articles

An Inhomogeneous Weibull–Hawkes Process to Model Underdispersed Acoustic Cues
IF 1.4 | Zone 4 (Mathematics) | Q1 Mathematics | Pub Date: 2024-05-11 | DOI: 10.1007/s13253-024-00626-w
Alec B. M. Van Helsdingen, Tiago A. Marques, Charlotte M. Jones-Todd

A Hawkes point process describes self-exciting behaviour where event arrivals are triggered by historic events. These models are increasingly becoming a popular choice in analysing event-type data. Like all other inhomogeneous Poisson point processes, the waiting time between events in a Hawkes process is derived from an exponential distribution with mean one. However, as with many ecological and environmental data, this is an unrealistic assumption. We, therefore, extend and generalise the Hawkes process to account for potential under- or overdispersion in the waiting times between events by assuming the Weibull distribution as the foundation of the waiting times. We apply this model to the acoustic cue production times of sperm whales and show that our Weibull–Hawkes model better captures the inherent underdispersion in the interarrival times of echolocation clicks emitted by these whales.
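
As a rough illustration of the abstract above, the sketch below assumes a constant baseline rate `mu` and an exponential triggering kernel with parameters `alpha` and `beta`. None of these choices comes from the paper, and all function and parameter names are hypothetical. Under a correctly specified Hawkes model, waiting times measured on the compensator (integrated-intensity) scale behave like unit-mean exponential draws, which is one way to read the "mean one" statement. A Weibull law with shape above 1 has a coefficient of variation below 1, so it is underdispersed relative to the exponential. How the authors embed the Weibull distribution in the process is not stated in the abstract, so this should not be taken as their implementation.

```python
import math


def intensity(t, events, mu, alpha, beta):
    """Conditional intensity at time t given the events strictly before t.

    Assumed form (not from the paper): mu + alpha * sum(exp(-beta * (t - s))).
    """
    return mu + alpha * sum(math.exp(-beta * (t - s)) for s in events if s < t)


def compensator(t, events, mu, alpha, beta):
    """Integral of the assumed intensity over [0, t]."""
    excited = sum(1.0 - math.exp(-beta * (t - s)) for s in events if s < t)
    return mu * t + (alpha / beta) * excited


def rescaled_waits(events, mu, alpha, beta):
    """Waiting times on the compensator scale.

    If the model is right, these should look like Exponential(1) draws.
    """
    lam = [compensator(t, events, mu, alpha, beta) for t in events]
    return [b - a for a, b in zip([0.0] + lam[:-1], lam)]


def weibull_cv(shape):
    """Coefficient of variation of a Weibull distribution (scale-free)."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return math.sqrt(g2 / (g1 * g1) - 1.0)
```

Comparing the spread of `rescaled_waits(...)` with `weibull_cv(1.0)`, which is the exponential case, gives a quick informal check for under- or overdispersion.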

Citations: 0
A class of models for large zero-inflated spatial data
IF 1.4 | Zone 4 (Mathematics) | Q1 Mathematics | Pub Date: 2024-04-29 | DOI: 10.1007/s13253-024-00619-9
Ben Seiyon Lee, Murali Haran

Spatially correlated data with an excess of zeros, usually referred to as zero-inflated spatial data, arise in many disciplines. Examples include count data, for instance, abundance (or lack thereof) of animal species and disease counts, as well as semi-continuous data like observed precipitation. Spatial two-part models are a flexible class of models for such data. Fitting two-part models can be computationally expensive for large data due to high-dimensional dependent latent variables, costly matrix operations, and slow mixing Markov chains. We describe a flexible, computationally efficient approach for modeling large zero-inflated spatial data using the projection-based intrinsic conditional autoregression (PICAR) framework. We study our approach, which we call PICAR-Z, through extensive simulation studies and two environmental data sets. Our results suggest that PICAR-Z provides accurate predictions while remaining computationally efficient. An important goal of our work is to allow researchers who are not experts in computation to easily build computationally efficient extensions to zero-inflated spatial models; this also allows for a more thorough exploration of modeling choices in two-part models than was previously possible. We show that PICAR-Z is easy to implement and extend in popular probabilistic programming languages such as nimble and stan.
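
To make the idea of a two-part model concrete, here is a minimal non-spatial sketch for semi-continuous data such as precipitation. It assumes a hurdle-type structure: a Bernoulli occurrence probability `p` and a log-normal law for the positive amounts. The log-normal choice, the function name, and the parameters are assumptions made for illustration; the spatial random effects and the PICAR basis expansion described in the abstract are not shown.

```python
import math


def two_part_loglik(y, p, mu, sigma):
    """Log-likelihood of a simple two-part (hurdle-type) model.

    Part 1: P(y > 0) = p, so each zero contributes log(1 - p).
    Part 2: log(y) | y > 0 ~ Normal(mu, sigma^2) (an assumed choice).
    """
    ll = 0.0
    for v in y:
        if v < 0:
            raise ValueError("observations must be non-negative")
        if v == 0:
            ll += math.log(1.0 - p)
        else:
            z = (math.log(v) - mu) / sigma
            # log of p times the log-normal density at v
            ll += math.log(p) - math.log(v * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z
    return ll
```

In a spatial version, `p` and `mu` would vary by location through latent spatial effects, which is where the computational cost mentioned in the abstract arises.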

Citations: 0
Modeling First Arrival of Migratory Birds Using a Hierarchical Max-Infinitely Divisible Process
IF 1.4 | Zone 4 (Mathematics) | Q1 Mathematics | Pub Date: 2024-04-28 | DOI: 10.1007/s13253-024-00624-y
Dhanushi A. Wijeyakulasuriya, Ephraim M. Hanks, Benjamin A. Shaby

Humans have recorded the arrival dates of migratory birds for millennia, searching for trends and patterns. As the first arrival among individuals in a species is the realized tail of the probability distribution of arrivals, the appropriate statistical framework with which to analyze such events is extreme value theory. Here, for the first time, we apply formal extreme value techniques to the dynamics of bird migrations. We study the annual first arrivals of Magnolia Warblers using modern tools from the statistical field of extreme value analysis. Using observations from the eBird database, we model the spatial distribution of observed Magnolia Warbler arrivals as a max-infinitely divisible process, which allows us to spatially interpolate observed annual arrivals in a probabilistically coherent way and to project arrival dynamics into the future by conditioning on climatic variables. Supplementary materials accompanying this paper appear online.

Citations: 0
Regularized Latent Trajectory Models for Spatio-temporal Population Dynamics
IF 1.4 | Zone 4 (Mathematics) | Q1 Mathematics | Pub Date: 2024-04-01 | DOI: 10.1007/s13253-024-00616-y
Xinyi Lu, Yoichiro Kanno, George P. Valentine, Matt A. Kulp, Mevin B. Hooten

Climate change impacts ecosystems variably in space and time. Landscape features may confer resistance against environmental stressors, whose intensity and frequency also depend on local weather patterns. Characterizing spatio-temporal variation in population responses to these stressors improves our understanding of what constitutes climate change refugia. We developed a Bayesian hierarchical framework that allowed us to differentiate population responses to seasonal weather patterns depending on their “sensitive” or “resilient” states. The framework inferred these sensitivity states based on latent trajectories delineating dynamic state probabilities. The latent trajectories are composed of linear initial conditions, functional regression models, and additive random effects representing ecological mechanisms such as topological buffering and effects of legacy weather conditions. Further, we developed a Bayesian regularization strategy that promoted temporal coherence in the inferred states. We demonstrated our hierarchical framework and regularization strategy using simulated examples and a case study of native brook trout (Salvelinus fontinalis) count data from the Great Smoky Mountains National Park, southeastern USA. Our study provided insights into ecological processes influencing brook trout sensitivity. Our framework can also be applied to other species and ecosystems to facilitate management and conservation.

Citations: 0
Models to Support Forest Inventory and Small Area Estimation Using Sparsely Sampled LiDAR: A Case Study Involving G-LiHT LiDAR in Tanana, Alaska
IF 1.4 | Zone 4 (Mathematics) | Q1 Mathematics | Pub Date: 2024-03-13 | DOI: 10.1007/s13253-024-00611-3
Andrew O. Finley, Hans-Erik Andersen, Chad Babcock, Bruce D. Cook, Douglas C. Morton, Sudipto Banerjee

A two-stage hierarchical Bayesian model is developed and implemented to estimate forest biomass density and total given sparsely sampled LiDAR and georeferenced forest inventory plot measurements. The model is motivated by the United States Department of Agriculture (USDA) Forest Service Forest Inventory and Analysis (FIA) objective to provide biomass estimates for the remote Tanana Inventory Unit (TIU) in interior Alaska. The proposed model yields stratum-level biomass estimates for arbitrarily sized areas. Model-based estimates are compared with the TIU FIA design-based post-stratified estimates. Model-based small area estimates (SAEs) for two experimental forests within the TIU are compared with each forest’s design-based estimates generated using a dense network of independent inventory plots. Model parameter estimates and biomass predictions are informed using FIA plot measurements, LiDAR data that are spatially aligned with a subset of the FIA plots, and complete coverage remotely detected data used to define landuse/landcover stratum and percent forest canopy cover. Results support a model-based approach to estimating forest parameters when inventory data are sparse or resources limit collection of enough data to achieve desired accuracy and precision using design-based methods. Supplementary materials accompanying this paper appear online.

Citations: 0
Exploring the Efficacy of Statistical and Deep Learning Methods for Large Spatial Datasets: A Case Study
IF 1.4 | Zone 4 (Mathematics) | Q1 Mathematics | Pub Date: 2024-02-08 | DOI: 10.1007/s13253-024-00602-4
Arnab Hazra, Pratik Nag, Rishikesh Yadav, Ying Sun

Increasingly large and complex spatial datasets pose massive inferential challenges due to high computational and storage costs. Our study is motivated by the KAUST Competition on Large Spatial Datasets 2023, which tasked participants with estimating spatial covariance-related parameters and predicting values at testing sites, along with uncertainty estimates. We compared various statistical and deep learning approaches through cross-validation and ultimately selected the Vecchia approximation technique for model fitting. To overcome the constraints in the R package GpGp, which lacked support for fitting zero-mean Gaussian processes and direct uncertainty estimation—two things that are necessary for the competition, we developed additional R functions. Besides, we implemented certain subsampling-based approximations and parametric smoothing for skewed sampling distributions of the estimators. Our team DesiBoys secured the first position in two out of four sub-competitions and the second position in the other two, validating the effectiveness of our proposed strategies. Moreover, we extended our evaluation to a large real spatial satellite-derived dataset on total precipitable water, where we compared the predictive performances of different models using multiple diagnostics.
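
For readers unfamiliar with the Vecchia approximation named above, the following is a minimal one-dimensional sketch for a zero-mean Gaussian process. It assumes an exponential covariance function and conditions each observation only on its immediate predecessor in the ordering (a conditioning set of size 1). It does not use or imitate the GpGp package interface; the function names and the parameters `variance` and `range_` are hypothetical.

```python
import math


def exp_cov(d, variance, range_):
    """Exponential covariance at distance d (an assumed kernel)."""
    return variance * math.exp(-abs(d) / range_)


def normal_logpdf(x, mean, var):
    """Log-density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def vecchia_loglik_m1(locs, y, variance, range_):
    """Vecchia-style log-likelihood for a zero-mean GP on ordered 1-D sites.

    The joint density is replaced by a product of univariate conditionals,
    each conditioning on the single preceding observation.
    """
    ll = normal_logpdf(y[0], 0.0, variance)
    for i in range(1, len(y)):
        c = exp_cov(locs[i] - locs[i - 1], variance, range_)
        w = c / variance                 # kriging weight on the neighbour
        cond_mean = w * y[i - 1]
        cond_var = variance - w * c      # conditional variance
        ll += normal_logpdf(y[i], cond_mean, cond_var)
    return ll
```

Each term costs constant time, so the whole evaluation is linear in the number of sites, whereas an exact Gaussian likelihood needs a factorization of the full covariance matrix. Larger conditioning sets follow the same pattern but need a small linear solve per term.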

Citations: 0
Two Tests of Significance for Preferred Direction in Tree Radial Growth Under a Linear-Circular Regression Model with Correlated Random Errors
IF 1.4 | Zone 4 (Mathematics) | Q1 Mathematics | Pub Date: 2024-01-31 | DOI: 10.1007/s13253-023-00599-2
Pierre Dutilleul, Tomoaki Imoto, Kunio Shimizu

To analyze tree growth statistically through annual ring widths measured in 2-D horizontal trunk sections, we propose two tests of significance defined under a linear-circular regression model with fixed trigonometric effects and normal random errors with a variance-covariance structure from the symmetric circulant family. The associated von Mises distribution has a preferred direction parameter. Accordingly, the first test aims to assess the presence of a preferred direction in the radial growth of a tree from the center of its trunk in a given year. Assuming there is a preferred direction of radial growth for the tree in two years, the second test extends the first one by assessing the equality of tree radial growth in the two preferred directions. Both tests of significance are modified F-tests with the denominator df adjusted for the presence of autocorrelation. Their validity is analyzed for two autoregressive symmetric circulant correlation structures, as a function of the number (n) of angular data and the autocorrelation parameter value. Effects of the inter-year correlation coefficient value are also studied in the two-year case. The performance of REstricted Maximum Likelihood as estimation method is scrutinized in an extensive Monte Carlo study, and the power of the tests is analyzed when valid. The new testing procedures are applied with (n = 32, 64) ring widths per year for a white spruce tree during 18 years of growth until its harvest. R codes are available. Conclusions and perspectives for future research are given. Supplementary materials accompanying this paper appear on-line.
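
A linear-circular regression with fixed trigonometric effects can be pictured as a first-harmonic fit of ring width against angle. The sketch below assumes `n` equally spaced angles starting at 0 and uses ordinary least squares, which has a closed form in that design. It ignores the correlated errors, the REML estimation, and the modified F-tests that are the actual subject of the paper; the function name and return layout are hypothetical.

```python
import math


def fit_first_harmonic(widths):
    """Least-squares fit of width = b0 + b1*cos(theta) + b2*sin(theta).

    Assumes the widths are measured at n >= 3 equally spaced angles
    theta_i = 2*pi*i/n, so the regressors are mutually orthogonal.
    Returns (mean width, amplitude, preferred direction in radians).
    """
    n = len(widths)
    if n < 3:
        raise ValueError("need at least 3 equally spaced angles")
    angles = [2.0 * math.pi * i / n for i in range(n)]
    b0 = sum(widths) / n
    b1 = 2.0 / n * sum(w * math.cos(a) for w, a in zip(widths, angles))
    b2 = 2.0 / n * sum(w * math.sin(a) for w, a in zip(widths, angles))
    amplitude = math.hypot(b1, b2)
    direction = math.atan2(b2, b1) % (2.0 * math.pi)
    return b0, amplitude, direction
```

An amplitude near zero would correspond to no preferred direction; deciding whether an estimated amplitude is significantly non-zero under autocorrelated errors is what the tests in the abstract address.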

Citations: 0
A Spatial Mixture Model for Spaceborne Lidar Observations Over Mixed Forest and Non-forest Land Types
IF 1.4 | Zone 4 (Mathematics) | Q1 Mathematics | Pub Date: 2024-01-30 | DOI: 10.1007/s13253-024-00600-6
Paul B. May, Andrew O. Finley, Ralph O. Dubayah

The Global Ecosystem Dynamics Investigation (GEDI) is a spaceborne lidar instrument that collects near-global measurements of forest structure. While expansive in scope, GEDI samples are spatially sparse and cover a small fraction of the land surface. Converting the sparse samples into spatially complete predictive maps is of practical importance for a number of ecological studies. A complicating factor is that GEDI collects measurements over forested and non-forested land alike, with no automatic labeling of the land type. Such classification is important, as it categorically influences the probability distribution of the spatial process and the ecological interpretation of the observations/predictions. We propose and implement a spatial mixture model, separating the observations and the greater spatial domain into two latent classes. The latent classes are governed by a Bernoulli spatial process, with spatial effects driven by a Gaussian process. Within each class, the process is governed by a separate spatial model, describing the unique probabilistic attributes. Model predictions take the form of scalar predictions of the GEDI observables as well as discrete labeling of the class membership. Inference is conducted through a Bayesian paradigm, yielding rich quantification of prediction and uncertainty through posterior predictive distributions. We demonstrate the method using GEDI data over Wollemi National Park, Australia, using optical data from Landsat 8 as model covariates. When compared to a single spatial model, the mixture model achieves much higher posterior predictive densities on the true value. When compared to a random forest model, a common algorithmic approach in the remote sensing community, the random forest achieves better absolute prediction accuracy for prediction locations far from observed training data locations, but at the expense of location-specific assessments of uncertainty. The unsupervised binary classifications of the mixture model appear broadly ecologically interpretable as forest and non-forest when compared to optical imagery, but further comparison to ground-truth data is required.
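
The class-labeling step of a two-class mixture can be illustrated at a single location with Bayes' rule. The sketch below assumes a scalar observation `y`, a fixed prior probability of the forest class, and a normal density within each class. In the paper the prior comes from a Bernoulli spatial process and the within-class models are spatial, so the fixed scalars here, along with all names, are simplifying assumptions.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def forest_probability(y, prior_forest, mean_forest, sd_forest, mean_open, sd_open):
    """Posterior probability that observation y belongs to the forest class.

    Note: for y far from both class means the densities can underflow to
    zero; a log-scale implementation would be more robust.
    """
    a = prior_forest * normal_pdf(y, mean_forest, sd_forest)
    b = (1.0 - prior_forest) * normal_pdf(y, mean_open, sd_open)
    return a / (a + b)
```

Thresholding this probability at 0.5 gives a discrete class label, while the probability itself carries the location-specific uncertainty.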

Citations: 0
Clustering and Geodesic Scaling of Dissimilarities on the Spherical Surface
IF 1.4 Zone 4 (Mathematics) Q1 Mathematics Pub Date: 2024-01-30 DOI: 10.1007/s13253-023-00597-4

Abstract

Spherical embedding is an important tool in several fields of data analysis, including environmental data, spatial statistics, text mining, gene expression analysis, medical research and, in general, areas in which the geodesic distance is a relevant factor. Many data acquisition technologies are related to massive data acquisition, and these high-dimensional vectors are often normalised and transformed into spherical data. In this representation of data on spherical surfaces, multidimensional scaling plays an important role. Traditionally, the methods of clustering and representation have been combined, since the precision of the representation tends to decrease when a large number of objects are involved, which makes interpretation difficult. In this paper, we present a model that partitions objects into classes while simultaneously representing the cluster centres on a spherical surface based on geodesic distances. The model combines a partition algorithm based on the approximation of dissimilarities to geodesic distances with a representation procedure for geodesic distances. In this process, the dissimilarities are transformed in order to optimise the radius of the sphere. The efficiency of the procedure described is analysed by means of an extensive Monte Carlo experiment, and its usefulness is illustrated for real data sets. Supplementary material to this paper is provided online.
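
As a small illustration of the geodesic distance that the abstract refers to, the following sketch computes the great-circle distance between two points on a sphere. It is not the authors' algorithm: the function name `geodesic_distance`, the `radius` parameter, and the example points are assumptions made here for illustration only.

```python
import math

def geodesic_distance(x, y, radius=1.0):
    """Great-circle distance between two points given as coordinate vectors.

    Hypothetical helper: both vectors are projected onto the unit sphere,
    and the angle between them is scaled by the sphere radius.
    """
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cos_angle = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    # Clamp to [-1, 1] so rounding error cannot break acos.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return radius * math.acos(cos_angle)

# Example: the north pole and a point on the equator are a quarter circle apart.
north_pole = [0.0, 0.0, 1.0]
equator_point = [1.0, 0.0, 0.0]
quarter_circle = geodesic_distance(north_pole, equator_point)
```

Because the inputs are normalised first, high-dimensional vectors that have not yet been scaled to unit length can be passed in directly; the radius then only rescales the resulting angle.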

Citations: 0
Animal Density Estimation for Large Unmarked Populations Using a Spatially Explicit Model
IF 1.4 Zone 4 (Mathematics) Q1 Mathematics Pub Date: 2024-01-24 DOI: 10.1007/s13253-023-00598-3
Riki Herliansyah, Ruth King, Dede Aulia Rahman, Stuart King

Obtaining abundance and density estimates is a particularly important aspect within wildlife conservation and management. To monitor wildlife populations, the use of motion-sensor camera traps is becoming increasingly popular due to its non-invasive nature. However, animal identification is not always feasible in practice due to poor quality images and/or individuals not having uniquely identifiable physical characteristics. Spatially explicit models for unmarked individuals permit the estimation of animal density when individuals cannot be uniquely identified. Due to the structure of these models, a Bayesian super-population (data augmentation) approach is often used to fit the models to data, which involves specifying some reasonably large upper limit for the population. However, this approach presents substantial computational challenges for larger populations, as demonstrated by the motivating dataset relating to barking deer (Muntiacus muntjak) collected in Ujung Kulon National Park, Indonesia (with a population size in the low thousands). We develop a new and computationally efficient Bayesian algorithm for fitting the models to data that does not require specifying an upper population limit a priori. We apply the new algorithm to the large barking deer dataset, where the standard super-population approach is computationally expensive, and demonstrate a substantial improvement in computational efficiency. Supplementary material to this paper is provided online.

Citations: 0