Bayesian analysis and variable selection for spatial count data with an application to Rio de Janeiro gun violence
Guilherme Ludwig, Yuan Wang, Tingjin Chu, Haonan Wang, Jun Zhu
Spatial Statistics, volume 67, Article 100890. Published 2025-02-27.
DOI: 10.1016/j.spasta.2025.100890
URL: https://www.sciencedirect.com/science/article/pii/S2211675325000120
Abstract
Statistical analysis has been successfully applied to crime data for identification of crime hot spots and prediction of future crimes. In this paper, our main objective is to identify key factors for gun violence in Rio de Janeiro and study the relationship between these key factors and the number of reported events. We use a Bayesian hierarchical stochastic Poisson regression model for spatial counts, which enables us to address the over-dispersed count data and to handle the spatial correlation. Moreover, we propose a variable selection method for key factor identification based on the spike-and-slab prior distribution for the regression coefficients. A new Gibbs sampler is developed for sampling from the posterior distributions with the help of augmentation of Pólya-Gamma auxiliary variables. Simulation studies are used to demonstrate the performance of our proposed approach. Our analysis of the gun violence data in Rio de Janeiro reveals the relationship between violence events and socio-demographic covariates as well as an interpretable spatial random effect that accounts for unmeasured covariate information.
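To make the data structure described in the abstract concrete, the following Python sketch simulates spatial counts from a Poisson model whose log-rate combines covariates with a Gaussian spatial random effect. It is an illustration only, not the authors' model or their Gibbs sampler: the regular grid, the exponential covariance, the parameter values, and every function and variable name are assumptions introduced here. Only two of the simulated coefficients are non-zero, which mimics the sparse setting that a spike-and-slab prior is meant to recover.

```python
import numpy as np


def simulate_spatial_counts(n_side=10, n_covariates=5, range_param=2.0,
                            sigma2=1.0, seed=0):
    """Simulate Poisson counts with a Gaussian spatial random effect.

    All settings are hypothetical; they are not taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Regular grid of locations, a stand-in for real spatial units.
    xs, ys = np.meshgrid(np.arange(n_side), np.arange(n_side))
    coords = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
    n = coords.shape[0]

    # Exponential covariance (an assumed choice) between all pairs of locations.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    cov = sigma2 * np.exp(-dists / range_param)

    # Draw the spatial random effect via a Cholesky factor (small jitter for stability).
    chol = np.linalg.cholesky(cov + 1e-8 * np.eye(n))
    w = chol @ rng.standard_normal(n)

    # Covariates with a sparse coefficient vector: only the first two matter.
    X = rng.standard_normal((n, n_covariates))
    beta = np.zeros(n_covariates)
    beta[:2] = [0.8, -0.5]
    intercept = 1.0

    # Poisson counts with log-rate = intercept + covariate effect + spatial effect.
    log_rate = intercept + X @ beta + w
    y = rng.poisson(np.exp(log_rate))
    return coords, X, beta, w, y


coords, X, beta, w, y = simulate_spatial_counts()
print("mean of counts:", y.mean())
print("variance of counts:", y.var())
```

Because the rate varies across locations, the marginal variance of the counts exceeds their mean, which is the over-dispersion the abstract refers to. Comparing the two printed values gives a quick check of that property on simulated data.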
About the journal:
Spatial Statistics publishes articles on the theory and application of spatial and spatio-temporal statistics. It favours manuscripts that present theory generated by new applications, or in which new theory is applied to an important practical case. A purely theoretical study will only rarely be accepted. Pure case studies without methodological development are not acceptable for publication.
Spatial statistics concerns the quantitative analysis of spatial and spatio-temporal data, including their statistical dependencies, accuracy and uncertainties. Methodology for spatial statistics is typically found in probability theory, stochastic modelling and mathematical statistics as well as in information science. Spatial statistics is used in mapping, assessing spatial data quality, sampling design optimisation, modelling of dependence structures, and drawing of valid inference from a limited set of spatio-temporal data.