Estimation of HIV prevalence at the ZIP code-level in Atlanta, Georgia: Bayesian prediction modeling using passive surveillance data and social determinants of disease spreading
Authors: Enrique M. Saldarriaga, Anirban Basu
Journal: Public Health, volume 237, pages 282-290 (published 2024-10-29)
DOI: 10.1016/j.puhe.2024.10.019
URL: https://www.sciencedirect.com/science/article/pii/S003335062400430X
Abstract
Objective
This study aims to predict the number of undiagnosed HIV cases at the ZIP Code-level in Atlanta, Georgia, based on publicly available information.
Study design
Statistical modeling.
Methods
We fitted a Bayesian hierarchical Binomial model to county-level estimates from the passive surveillance system. The Binomial denominator was the true total number of HIV cases, modeled as arising from a Negative Binomial distribution. The trial probability, known as the ascertainment probability, depended on socio-economic determinants of HIV that were retained via feature-selection algorithms. Data were obtained from CDC's HIV report for End of the HIV Epidemic and the American Community Survey. The prediction model was assessed out-of-sample in Georgia counties. We combined socio-economic data with the posterior predictive distribution of the coefficients to predict the mean ascertainment probability and total HIV cases at the ZIP Code level. These estimates were spatially smoothed and aggregated to the county level for secondary validation.
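The generative structure described in the Methods paragraph can be sketched as a forward simulation. This is only an illustration under stated assumptions, not the authors' implementation: the logit link, the function name `simulate_counties`, the covariate matrix `X`, the coefficients `beta`, and the Negative Binomial mean `mu` and dispersion `r` are all hypothetical, and no model fitting or spatial smoothing is performed.

```python
import numpy as np

def simulate_counties(X, beta, mu, r, seed=0):
    """Draw one synthetic data set from a Binomial / Negative Binomial hierarchy.

    X    : (n_areas, n_features) hypothetical socio-economic covariates
    beta : (n_features,) hypothetical coefficients
    mu, r: assumed mean and dispersion of the Negative Binomial for total cases
    """
    rng = np.random.default_rng(seed)
    # Ascertainment probability per area, via an assumed logit link.
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    # NumPy parameterizes the Negative Binomial by (n, p);
    # setting p_nb = r / (r + mu) gives a distribution with mean mu.
    total = rng.negative_binomial(r, r / (r + mu), size=X.shape[0])
    # Diagnosed (ascertained) cases are a Binomial draw from the true total.
    diagnosed = rng.binomial(total, p)
    undiagnosed = total - diagnosed
    return p, total, diagnosed, undiagnosed

# Made-up inputs: three areas, an intercept column plus one covariate.
X = np.array([[1.0, 0.5], [1.0, -0.2], [1.0, 0.0]])
beta = np.array([1.5, 0.8])
p, total, diagnosed, undiagnosed = simulate_counties(X, beta, mu=500.0, r=5.0)
```

In a Bayesian fit the direction is reversed: only the diagnosed counts and covariates are observed, and the total cases and coefficients are inferred from them.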
Results
The county-level model showed good mixing properties and predictive accuracy. The mean ascertainment probability calibrated to the ZIP Code level varied from 78.4% (95% credible interval [CI]: 24.4%–99.3%) to 93.8% (95% CI: 80.6%–99.8%). Further, the predicted number of undiagnosed HIV cases ranged from 12 (95% CI: 6–19; ZIP Code 30322) to 1603 (95% CI: 1209–1968; ZIP Code 30318).
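To see how an ascertainment probability translates into undiagnosed cases, a back-of-envelope calculation may help. Under a Binomial model with ascertainment probability p, the expected diagnosed count is N·p, so a plug-in estimate of the total is diagnosed / p. The sketch below is not the paper's posterior computation, the function name is hypothetical, and the input numbers are made up for illustration.

```python
def plug_in_undiagnosed(diagnosed, ascertainment):
    """Plug-in estimate: total ~ diagnosed / p, undiagnosed = total - diagnosed."""
    if not 0.0 < ascertainment <= 1.0:
        raise ValueError("ascertainment must be in (0, 1]")
    total = diagnosed / ascertainment
    return total - diagnosed

# Hypothetical area with 900 diagnosed cases and 90% ascertainment.
estimate = plug_in_undiagnosed(900, 0.90)
```

A point estimate like this ignores the uncertainty in p, which is why the reported results come with credible intervals.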
Conclusions
Our findings provide a more complete picture of the relative burden of HIV across ZIP codes. Such information can be used by Local Health Departments to identify underserved areas and allocate resources accordingly. Furthermore, our methodological approach can be applied to complement the information obtained from passive surveillance, especially when more resource-intensive approaches are not available or are unfeasible to employ.
About the journal:
Public Health is an international, multidisciplinary peer-reviewed journal. It publishes original papers, reviews and short reports on all aspects of the science, philosophy, and practice of public health.