Geographically Weighted Zero-Inflated Negative Binomial Regression: A general case for count data
Alan Ricardo da Silva, Marcos Douglas Rodrigues de Sousa
Spatial Statistics, published 2023-11-04
DOI: 10.1016/j.spasta.2023.100790
URL: https://www.sciencedirect.com/science/article/pii/S2211675323000659
Abstract
Poisson and negative binomial regression models are often used to describe the relationship between a count dependent variable and a set of independent variables. However, these models fit data with an excess of zeros poorly; Zero-Inflated Poisson (ZIP) and Zero-Inflated Negative Binomial (ZINB) models are more appropriate for such data. To incorporate the spatial dimension into count data models, Geographically Weighted Poisson Regression (GWPR), Geographically Weighted Negative Binomial Regression (GWNBR) and Geographically Weighted Zero-Inflated Poisson Regression (GWZIPR) have been developed, but no geographically weighted model yet combines the negative binomial distribution with zero inflation so as to handle overdispersion and an excess of zeros together. Such data arose at the beginning of the COVID-19 pandemic, when some places were experiencing outbreaks while others had no cases yet. We therefore propose a Geographically Weighted Zero-Inflated Negative Binomial Regression (GWZINBR) model, which can be considered a general case for count data, since locally it can reduce to a GWZIPR, GWNBR or GWPR model. We applied this model to simulated data and to COVID-19 cases in South Korea at the beginning of the pandemic in 2020, and the results provided a better understanding of the phenomenon than the GWNBR model.
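The "general case" claim can be illustrated with the textbook ZINB probability mass function. The sketch below is not the authors' implementation: it assumes the common mean/overdispersion parameterization (variance `mu + alpha * mu**2`), and the names `nb_pmf`, `zinb_pmf`, `mu`, `alpha` and `pi` are illustrative choices rather than notation taken from the paper. In a geographically weighted setting these parameters would be estimated separately at each location, which is how the model could locally reduce to a simpler one.

```python
import math


def nb_pmf(y, mu, alpha):
    """Negative binomial pmf with mean mu > 0 and overdispersion alpha.

    Assumed parameterization: variance = mu + alpha * mu**2.
    alpha <= 0 is treated as the Poisson limit.
    """
    if alpha <= 0:
        # Poisson pmf, the limit of the negative binomial as alpha -> 0
        return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))
    r = 1.0 / alpha  # "size" parameter of the negative binomial
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, alpha, pi):
    """Zero-inflated negative binomial pmf.

    pi is the probability of a structural zero; with probability
    1 - pi the count comes from the negative binomial component.
    """
    base = (1.0 - pi) * nb_pmf(y, mu, alpha)
    return pi + base if y == 0 else base


# Special cases of the same function (illustrative values only):
p_zinb = zinb_pmf(0, mu=2.0, alpha=0.5, pi=0.3)     # full ZINB
p_nb = zinb_pmf(0, mu=2.0, alpha=0.5, pi=0.0)       # pi = 0: plain NB
p_zip = zinb_pmf(0, mu=2.0, alpha=0.0, pi=0.3)      # alpha = 0: ZIP
p_poisson = zinb_pmf(0, mu=2.0, alpha=0.0, pi=0.0)  # both: Poisson
```

Setting `pi = 0` removes the zero-inflation component and setting `alpha = 0` removes the overdispersion, so the four standard count models are nested in one likelihood.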
About the journal:
Spatial Statistics publishes articles on the theory and application of spatial and spatio-temporal statistics. It favours manuscripts that present theory generated by new applications, or in which new theory is applied to an important practical case. A purely theoretical study will only rarely be accepted. Pure case studies without methodological development are not acceptable for publication.
Spatial statistics concerns the quantitative analysis of spatial and spatio-temporal data, including their statistical dependencies, accuracy and uncertainties. Methodology for spatial statistics is typically found in probability theory, stochastic modelling and mathematical statistics as well as in information science. Spatial statistics is used in mapping, assessing spatial data quality, sampling design optimisation, modelling of dependence structures, and drawing of valid inference from a limited set of spatio-temporal data.