A hypothesis test for detecting distance-specific clustering and dispersion in areal data

Spatial Statistics, Pub Date: 2023-06-01, DOI: 10.1016/j.spasta.2023.100757

Stella Self, Anna Overby, Anja Zgodic, David White, Alexander McLain, Caitlin Dyckman



Abstract

Spatial clustering detection has applications in diverse fields, including identifying infectious disease outbreaks, pinpointing crime hotspots, and identifying clusters of neurons in brain imaging. Ripley’s K-function is a popular method for detecting clustering (or dispersion) in point process data at specific distances. It measures the expected number of points within a given distance of an observed point. Clustering can be assessed by comparing the observed value of Ripley’s K-function to its expected value under complete spatial randomness. Although spatial clustering analysis is most often performed on point process data, areal data arise in many applications and also require accurate clustering assessment. Inspired by Ripley’s K-function, we develop the positive area proportion function (PAPF) and use it to construct a hypothesis testing procedure for detecting spatial clustering and dispersion at specific distances in areal data. Using extensive simulation studies, we compare the performance of the proposed PAPF hypothesis test to that of the global Moran’s I statistic, the Getis–Ord general G statistic, and the spatial scan statistic. We then evaluate the real-world performance of our method by using it to detect spatial clustering in land parcels containing conservation easements and in US counties with high pediatric overweight/obesity rates.
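
The comparison of Ripley’s K-function against complete spatial randomness (CSR) can be illustrated with a minimal Python sketch. It is not the authors' PAPF procedure, which the abstract does not define. It assumes planar point data, a known study-region area, and a naive estimator with no edge correction; the function names `ripley_k` and `csr_k` are hypothetical. The estimator scales the pair count by the estimated intensity, so under CSR its expected value is approximately πr².

```python
import math


def ripley_k(points, r, area):
    """Naive estimate of Ripley's K at distance r (no edge correction).

    points: list of (x, y) tuples; area: area of the study region.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    # Count ordered pairs (i, j) with i != j whose distance is at most r.
    close_pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                close_pairs += 1
    # Dividing by n * (n - 1) / area scales the count by the estimated intensity.
    return area * close_pairs / (n * (n - 1))


def csr_k(r):
    """Theoretical K(r) for a planar process under complete spatial randomness."""
    return math.pi * r ** 2


if __name__ == "__main__":
    # Illustrative data only: five points packed into one corner of a unit square.
    pts = [(0.10, 0.10), (0.11, 0.10), (0.10, 0.11), (0.11, 0.11), (0.105, 0.105)]
    r = 0.1
    observed, expected = ripley_k(pts, r, area=1.0), csr_k(r)
    # Observed K above the CSR value suggests clustering at distance r;
    # observed K below it suggests dispersion.
    print(f"observed K = {observed:.4f}, CSR K = {expected:.4f}")
```

In practice the comparison is made over a range of distances, with edge correction and a simulation envelope or formal test rather than a single point comparison.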


