A class of models for large zero-inflated spatial data

Journal of Agricultural, Biological and Environmental Statistics (IF 1.4; Zone 4, Mathematics; JCR Q3, Biology) Pub Date: 2024-04-29 DOI: 10.1007/s13253-024-00619-9

Ben Seiyon Lee, Murali Haran


Citations: 0

Abstract



A class of models for large zero-inflated spatial data

Spatially correlated data with an excess of zeros, usually referred to as zero-inflated spatial data, arise in many disciplines. Examples include count data, for instance, abundance (or lack thereof) of animal species and disease counts, as well as semi-continuous data like observed precipitation. Spatial two-part models are a flexible class of models for such data. Fitting two-part models can be computationally expensive for large data due to high-dimensional dependent latent variables, costly matrix operations, and slow mixing Markov chains. We describe a flexible, computationally efficient approach for modeling large zero-inflated spatial data using the projection-based intrinsic conditional autoregression (PICAR) framework. We study our approach, which we call PICAR-Z, through extensive simulation studies and two environmental data sets. Our results suggest that PICAR-Z provides accurate predictions while remaining computationally efficient. An important goal of our work is to allow researchers who are not experts in computation to easily build computationally efficient extensions to zero-inflated spatial models; this also allows for a more thorough exploration of modeling choices in two-part models than was previously possible. We show that PICAR-Z is easy to implement and extend in popular probabilistic programming languages such as nimble and stan.
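To make the idea of a two-part model with low-dimensional spatial effects concrete, here is a minimal simulation sketch for semi-continuous data. It is an illustration only and is not the authors' PICAR-Z implementation: the Gaussian-bump basis, the function name `simulate_two_part`, the intercepts, and the lognormal choice for the positive part are all assumptions made for this example. The only idea it borrows from the abstract is that each part of the model gets its own spatial effect, here represented by a small number of basis weights rather than one latent variable per location.

```python
import math
import random


def simulate_two_part(n=500, n_basis=9, seed=1):
    """Simulate semi-continuous data on the unit square from a two-part model.

    Part 1 (occurrence): Bernoulli with a logit link.
    Part 2 (amount): lognormal, drawn only where occurrence is 1.
    """
    rng = random.Random(seed)

    # Hypothetical basis: Gaussian bumps centred on a regular grid.
    # This is a stand-in for illustration, not the PICAR basis.
    side = math.isqrt(n_basis)
    centers = [((i + 0.5) / side, (j + 0.5) / side)
               for i in range(side) for j in range(side)]

    # Separate low-dimensional weights for each part of the model.
    delta_occ = [rng.gauss(0.0, 1.0) for _ in centers]
    delta_amt = [rng.gauss(0.0, 0.5) for _ in centers]

    data = []
    for _ in range(n):
        s = (rng.random(), rng.random())  # random location in [0, 1]^2
        phi = [math.exp(-((s[0] - c[0]) ** 2 + (s[1] - c[1]) ** 2)
                        / (2 * 0.2 ** 2))
               for c in centers]
        # Spatial effects as basis expansions: w(s) = sum_k phi_k(s) * delta_k
        w_occ = sum(p * d for p, d in zip(phi, delta_occ))
        w_amt = sum(p * d for p, d in zip(phi, delta_amt))

        # Part 1: probability that the response is non-zero.
        p_pos = 1.0 / (1.0 + math.exp(-(-0.5 + w_occ)))
        if rng.random() < p_pos:
            # Part 2: strictly positive amount, lognormal.
            y = math.exp(rng.gauss(0.2 + w_amt, 0.3))
        else:
            y = 0.0
        data.append((s, y))
    return data


if __name__ == "__main__":
    sample = simulate_two_part()
    n_zero = sum(1 for _, y in sample if y == 0.0)
    print(f"{n_zero} of {len(sample)} simulated responses are exactly zero")
```

For count data, the lognormal draw could be swapped for a count distribution on the positive integers; the structure of the two parts stays the same.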


Source journal: Journal of Agricultural, Biological and Environmental Statistics (Biology)
CiteScore: 2.70
Self-citation rate: 7.10%
Articles published: (no value listed)
Review time: >12 weeks

About the journal: The Journal of Agricultural, Biological and Environmental Statistics (JABES) publishes papers that introduce new statistical methods to solve practical problems in the agricultural sciences, the biological sciences (including biotechnology), and the environmental sciences (including those dealing with natural resources). Papers that apply existing methods in a novel context are also encouraged. Interdisciplinary papers and papers that illustrate the application of new and important statistical methods using real data are strongly encouraged. The journal does not normally publish papers that have a primary focus on human genetics, human health, or medical statistics.