A BAYESIAN HIERARCHICAL SMALL AREA POPULATION MODEL ACCOUNTING FOR DATA SOURCE SPECIFIC METHODOLOGIES FROM AMERICAN COMMUNITY SURVEY, POPULATION ESTIMATES PROGRAM, AND DECENNIAL CENSUS DATA.

IF 1.3 · Zone 4, Mathematics · Q2 Statistics & Probability · Annals of Applied Statistics · Pub Date: 2024-06-01 · Epub Date: 2024-04-05 · DOI: 10.1214/23-aoas1849
Emily N Peterson, Rachel C Nethery, Tullia Padellini, Jarvis T Chen, Brent A Coull, Frédéric B Piel, Jon Wakefield, Marta Blangiardo, Lance A Waller
Annals of Applied Statistics, vol. 18, no. 2, pp. 1565-1595. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11423836/pdf/

Abstract

Small area population counts are necessary for many epidemiological studies, yet their quality and accuracy are often not assessed. In the United States, small area population counts are published by the United States Census Bureau (USCB) in the form of decennial census counts, intercensal population projections from the Population Estimates Program (PEP), and American Community Survey (ACS) estimates. Although these three data sources are closely related, they differ in important ways in data collection, data availability, and processing methodology, so each set of reported population counts may be subject to different sources and magnitudes of error. Additionally, the sources do not report identical small area population counts because of post-survey adjustments specific to each. Consequently, in public health studies, small area disease and mortality rates may differ depending on which data source supplies the denominator. To accurately estimate annual small area population counts and their associated uncertainties, we present a Bayesian population (BPop) model, which fuses information from all three USCB sources while accounting for data-source-specific methodologies and associated errors. We produce comprehensive small area race-stratified estimates of the true population, with associated uncertainties, given the observed trends in all three USCB population estimates. The main features of our framework are: (1) a single model integrating multiple data sources, (2) explicit modeling of data-source-specific data-generating mechanisms and errors, and (3) prediction of population counts for years without USCB-reported data. We focus on the Black-only and White-only populations of the 159 counties of Georgia and produce estimates for the years 2006-2023. We compare BPop population estimates to decennial census counts, PEP annual counts, and ACS multi-year estimates. Additionally, we illustrate and explain the different types of data-source-specific errors. Lastly, we assess model performance using simulations and validation exercises. Our Bayesian population model can be extended to other applications at finer spatial granularity, to demographic subpopulations defined further by race, age, and sex, and/or to other geographical regions.
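
To make the idea of fusing several error-prone counts concrete, the following is a minimal sketch and not the authors' BPop model. It assumes a flat prior on the true count and independent normal errors with known standard deviations and known additive biases, which gives a precision-weighted average. The class `SourceEstimate`, the function `fuse`, and every number below are hypothetical and chosen only for illustration; none of them come from the paper.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class SourceEstimate:
    name: str
    count: float       # reported population count (made-up value)
    sd: float          # assumed standard error of that count
    bias: float = 0.0  # assumed known additive bias (hypothetical)


def fuse(estimates):
    """Posterior mean and sd of the true count under a flat prior
    and independent normal errors (precision-weighted average)."""
    weights = [1.0 / e.sd ** 2 for e in estimates]
    total = sum(weights)
    mean = sum(w * (e.count - e.bias) for w, e in zip(weights, estimates)) / total
    return mean, sqrt(1.0 / total)


# Illustrative values for a single county-year; not taken from the paper.
sources = [
    SourceEstimate("census", 10_000, 100),
    SourceEstimate("pep", 10_400, 200),
    SourceEstimate("acs", 9_600, 400),
]
mean, sd = fuse(sources)

# The same 50 deaths give a different rate per 100,000 for each denominator.
deaths = 50
rates = {e.name: deaths / e.count * 100_000 for e in sources}
fused_rate = deaths / mean * 100_000

print(f"fused count: {mean:.0f} (sd {sd:.0f})")
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} per 100,000")
print(f"fused: {fused_rate:.1f} per 100,000")
```

In this toy setting the fused standard deviation is smaller than that of any single source, and the rate computed from the fused count falls between the source-specific rates. A hierarchical model of the kind the abstract describes would additionally have to estimate the source-specific errors and share information across areas, years, and race groups, none of which this sketch attempts.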

Source journal
Annals of Applied Statistics (Social Sciences: Statistics & Probability)
CiteScore: 3.10
Self-citation rate: 5.60%
Articles published: 131
Review time: 6-12 weeks
Journal description: Statistical research spans an enormous range from direct subject-matter collaborations to pure mathematical theory. The Annals of Applied Statistics, the newest journal from the IMS, is aimed at papers in the applied half of this range. The journal is published quarterly in both print and electronic form, and its goal is to provide a timely and unified forum for all areas of applied statistics.