A hypothesis test for detecting spatial patterns in categorical areal data

Spatial Statistics (IF 2.1; Zone 2, Mathematics; JCR Q3, Geosciences, Multidisciplinary). Pub Date: 2024-05-04. DOI: 10.1016/j.spasta.2024.100839

Stella Self, Xingpei Zhao, Anja Zgodic, Anna Overby, David White, Alexander C. McLain, Caitlin Dyckman

Spatial Statistics, volume 61, Article 100839. Journal Article. https://www.sciencedirect.com/science/article/pii/S2211675324000307

Citations: 0

Abstract


The vast growth of spatial datasets in recent decades has fueled the development of many statistical methods for detecting spatial patterns. Two of the most commonly studied spatial patterns are clustering, loosely defined as datapoints with similar attributes existing close together, and dispersion, loosely defined as the semi-regular placement of datapoints with similar attributes. In this work, we develop a hypothesis test to detect spatial clustering or dispersion at specific distances in categorical areal data. Such data consist of a set of spatial regions whose boundaries are fixed and known (e.g., counties), each associated with a categorical random variable (e.g., whether the county is rural, micropolitan, or metropolitan). We propose a method to extend the positive area proportion function (developed for detecting spatial clustering in binary areal data) to the categorical case. This proposal, referred to as the categorical positive areal proportion function test, can detect various spatial patterns, including homogeneous clusters, heterogeneous clusters, and dispersion. Our approach is the first method capable of distinguishing between different types of clustering in categorical areal data. After validating our method using an extensive simulation study, we use the categorical positive area proportion function test to detect spatial patterns in biological, agricultural, built, and open conservation easements in Boulder County, Colorado, USA.
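To make the idea of testing for clustering or dispersion at a specific distance more concrete, here is a minimal Python sketch. It is not the categorical positive areal proportion function test, whose definition the abstract does not give. As a stand-in, it assumes a much simpler statistic (the share of region pairs in a given centroid-distance band whose categories match) and a permutation null that randomly reassigns categories to regions. The function names, the toy grid, and the distance band are all hypothetical.

```python
import random
from itertools import combinations
from math import hypot


def pairs_at_distance(centroids, lo, hi):
    """Index pairs of regions whose centroid distance d satisfies lo <= d < hi."""
    return [
        (i, j)
        for i, j in combinations(range(len(centroids)), 2)
        if lo <= hypot(centroids[i][0] - centroids[j][0],
                       centroids[i][1] - centroids[j][1]) < hi
    ]


def same_category_share(pairs, labels):
    """Fraction of the given pairs whose two regions share a category."""
    if not pairs:
        raise ValueError("no region pairs fall in the requested distance band")
    return sum(labels[i] == labels[j] for i, j in pairs) / len(pairs)


def permutation_test(centroids, labels, lo, hi, n_perm=199, seed=0):
    """Return (observed share, upper-tail p-value, lower-tail p-value).

    A small upper-tail p-value suggests same-category clustering at this
    distance; a small lower-tail p-value suggests dispersion.
    """
    rng = random.Random(seed)
    pairs = pairs_at_distance(centroids, lo, hi)
    observed = same_category_share(pairs, labels)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random reassignment of categories to regions
        null.append(same_category_share(pairs, shuffled))
    p_cluster = (1 + sum(s >= observed for s in null)) / (n_perm + 1)
    p_disperse = (1 + sum(s <= observed for s in null)) / (n_perm + 1)
    return observed, p_cluster, p_disperse


# Toy 6 x 6 grid of unit cells, using cell coordinates as centroids.
grid = [(x, y) for x in range(6) for y in range(6)]
blocks = ["rural" if x < 3 else "metropolitan" for x, y in grid]
checkerboard = ["rural" if (x + y) % 2 == 0 else "metropolitan" for x, y in grid]

# The distance band [0.5, 1.2) picks out cells that share an edge.
print(permutation_test(grid, blocks, 0.5, 1.2))
print(permutation_test(grid, checkerboard, 0.5, 1.2))
```

In this toy setup the two-block layout should give a small upper-tail p-value (clustering among edge-sharing cells) and the checkerboard a small lower-tail p-value (dispersion). Unlike the test described in the abstract, this simple statistic cannot separate homogeneous from heterogeneous clusters.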


Source journal

Spatial Statistics (Geosciences, Multidisciplinary; Mathematics, Interdisciplinary Applications)

CiteScore: 4.00

Self-citation rate: 21.70%

Review time: 55 days

About the journal: Spatial Statistics publishes articles on the theory and application of spatial and spatio-temporal statistics. It favours manuscripts that present theory generated by new applications, or in which new theory is applied to an important practical case. A purely theoretical study will only rarely be accepted. Pure case studies without methodological development are not acceptable for publication. Spatial statistics concerns the quantitative analysis of spatial and spatio-temporal data, including their statistical dependencies, accuracy and uncertainties. Methodology for spatial statistics is typically found in probability theory, stochastic modelling and mathematical statistics as well as in information science. Spatial statistics is used in mapping, assessing spatial data quality, sampling design optimisation, modelling of dependence structures, and drawing of valid inference from a limited set of spatio-temporal data.