
Journal of the Royal Statistical Society Series A-Statistics in Society: Latest Articles

Mapping socio-economic status using mixed data: a hierarchical Bayesian approach.
IF 1.5 Zone 3 (Mathematics) Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-07-14 Epub Date: 2024-08-20 DOI: 10.1093/jrsssa/qnae080
Gabrielle Virgili-Gervais, Alexandra M Schmidt, Honor Bixby, Alicia Cavanaugh, George Owusu, Samuel Agyei-Mensah, Brian Robinson, Jill Baumgartner

We propose a Bayesian hierarchical model to estimate a socio-economic status (SES) index based on mixed dichotomous and continuous variables. In particular, we extend Quinn's ([2004]. Bayesian factor analysis for mixed ordinal and continuous responses. Political Analysis, 12(4), 338-353. https://doi.org/10.1093/pan/mph022) and Schliep and Hoeting's ([2013]. Multilevel latent Gaussian process model for mixed discrete and continuous multivariate response data. Journal of Agricultural, Biological, and Environmental Statistics, 18(4), 492-513. https://doi.org/10.1007/s13253-013-0136-z) factor analysis models for mixed dichotomous and continuous variables by allowing a spatial hierarchical structure of key parameters of the model. Unlike most SES assessment models proposed in the literature, the hierarchical nature of this model enables the use of census observations at the household level without needing to aggregate any information a priori. Therefore, it better accommodates the variability of the SES between census tracts and the number of households per area. The proposed model is used in the estimation of a socio-economic index using 10% of the 2010 Ghana census in the Greater Accra Metropolitan area. Out of the 20 observed variables, the number of people per room, access to water piping and flushable toilets differentiated high and low SES areas the best.

Pages: 859-874. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7617442/pdf/
Citations: 0
kpop: a kernel balancing approach for reducing specification assumptions in survey weighting.
IF 1.6 Zone 3 (Mathematics) Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-07-01 Epub Date: 2024-09-02 DOI: 10.1093/jrsssa/qnae082
Erin Hartman, Chad Hazlett, Ciara Sterbenz

With the precipitous decline in response rates, researchers and pollsters have been left with highly nonrepresentative samples, relying on constructed weights to make these samples representative of the desired target population. Though practitioners employ valuable expert knowledge to choose what variables X must be adjusted for, they rarely defend particular functional forms relating these variables to the response process or the outcome. Unfortunately, commonly used calibration weights (which make the weighted mean of X in the sample equal that of the population) only ensure correct adjustment when the portions of the outcome and the response process left unexplained by linear functions of X are independent. To alleviate this functional form dependency, we describe kernel balancing for population weighting (kpop). This approach replaces the design matrix X with a kernel matrix K encoding high-order information about X. Weights are then found to make the weighted average row of K among sampled units approximately equal to that of the target population. This produces good calibration on a wide range of smooth functions of X, without relying on the user to decide which X or what functions of them to include. We describe the method and illustrate it by application to polling data from the 2016 US presidential election.
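
The balancing idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' estimator: the Gaussian kernel, the `bandwidth` and `ridge` values, the ridge least-squares solver, and all function names are assumptions made for illustration, and the resulting weights are not constrained to be non-negative.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth):
    """Gaussian kernel between every row of A and every row of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / bandwidth)

def kernel_balancing_weights(X_sample, X_pop, bandwidth=2.0, ridge=1e-3):
    """Weights (summing to 1) that pull the weighted mean kernel row of the
    sample toward the mean kernel row of the target population.
    Assumption: ridge least squares stands in for the paper's optimisation."""
    n = X_sample.shape[0]
    K_sample = gaussian_kernel(X_sample, X_sample, bandwidth)  # n x n
    K_pop = gaussian_kernel(X_pop, X_sample, bandwidth)        # N x n
    target = K_pop.mean(axis=0)            # average row of K in the population
    uniform = np.full(n, 1.0 / n)
    residual = target - K_sample.T @ uniform   # imbalance under equal weights
    # Minimise ||K_sample.T v - residual||^2 + ridge * ||v||^2 subject to
    # sum(v) = 0, by solving the corresponding KKT linear system.
    M = K_sample @ K_sample.T + ridge * np.eye(n)
    kkt = np.block([[M, np.ones((n, 1))],
                    [np.ones((1, n)), np.zeros((1, 1))]])
    rhs = np.concatenate([K_sample @ residual, [0.0]])
    v = np.linalg.solve(kkt, rhs)[:n]
    return uniform + v

def imbalance(weights, X_sample, X_pop, bandwidth=2.0):
    """Distance between the weighted sample mean row of K and the population mean row."""
    K_sample = gaussian_kernel(X_sample, X_sample, bandwidth)
    target = gaussian_kernel(X_pop, X_sample, bandwidth).mean(axis=0)
    return float(np.linalg.norm(K_sample.T @ weights - target))

# Synthetic example: units with a large first covariate respond more often.
rng = np.random.default_rng(0)
X_pop = rng.normal(size=(400, 2))
responded = rng.random(400) < 1.0 / (1.0 + np.exp(-2.0 * X_pop[:, 0]))
X_sample = X_pop[responded]
w = kernel_balancing_weights(X_sample, X_pop)
equal = np.full(len(w), 1.0 / len(w))
print(imbalance(equal, X_sample, X_pop), imbalance(w, X_sample, X_pop))
```

The two printed numbers compare the kernel imbalance under equal weights with the imbalance after reweighting; a smaller second value indicates that the sample's average kernel row has moved toward the population's.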

Volume 188, Issue 3, Pages 875-895. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12352454/pdf/
Citations: 0
Graphical displays and related statistical measures of health disparities between groups in complex sample surveys.
IF 1.6 Zone 3 (Mathematics) Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-05-22 DOI: 10.1093/jrsssa/qnaf044
Mark Louie Ramos, Barry Graubard, Joseph Gastwirth

Different methods for describing health disparities in the distributions of continuous measured health-related variables among groups provide more insight into the nature and impact of the disparities than comparing measures of central tendency. Transformations of the Lorenz curve and analogues of the Gini index used in the analysis of income inequality are adapted to provide graphical and analytical measures of health disparities. Akin to the classical Peters-Belson regression method for partitioning a disparity into a component explained by group differences in a set of covariates and an unexplained component, a new modified Lorenz curve is proposed. The estimation of these curves/measures is adapted for data obtained from surveys with complex sample weighted designs. The statistical properties of sample weighted estimators of the proposed measures and their bootstrap variances are explored through simulation studies. Applications are demonstrated using BMI and blood lead levels among race/ethnic groups of adult females and children, respectively, from the 2013-2018 and 1988-1994 US National Health and Nutrition Examination Surveys. Another application examines disparities in distance to nearest acute care hospital among census blocks in the US state of New York grouped by their level of urbanicity using US census data and the American Hospital Association survey.
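
As background for the abstract above, the sketch below computes the classical Lorenz curve and Gini index with survey weights. It shows only the standard income-inequality versions that the paper starts from, not the transformed or modified curves the authors propose; the function names are hypothetical, and values are assumed non-negative with a positive weighted total.

```python
import numpy as np

def weighted_lorenz(values, weights):
    """Cumulative population share and cumulative share of the variable,
    with units ordered from smallest to largest value."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    v, w = values[order], weights[order]
    pop_share = np.concatenate([[0.0], np.cumsum(w) / w.sum()])
    val_share = np.concatenate([[0.0], np.cumsum(w * v) / (w * v).sum()])
    return pop_share, val_share

def weighted_gini(values, weights):
    """Gini index as 1 minus twice the trapezoidal area under the Lorenz curve."""
    p, lorenz = weighted_lorenz(values, weights)
    area = np.sum((p[1:] - p[:-1]) * (lorenz[1:] + lorenz[:-1]) / 2.0)
    return 1.0 - 2.0 * area

# A unit with weight 2 counts the same as two identical unweighted units.
print(weighted_gini([1.0, 3.0], [2.0, 1.0]))
print(weighted_gini([1.0, 1.0, 3.0], [1.0, 1.0, 1.0]))
```

The two calls return the same value, which is the property that lets sampling weights stand in for the population units each respondent represents.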

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12341090/pdf/
Citations: 0
A Bayesian zero-inflated spatially varying coefficients model for overdispersed binomial data.
IF 1.6 Zone 3 (Mathematics) Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-05-21 DOI: 10.1093/jrsssa/qnaf056
Chun-Che Wen, Rajib Paul, Kelly J Hunt, A James O'Malley, Hong Li, Elizabeth Hill, Angela M Malek, Brian Neelon

Cardiometabolic risk factors (CRFs) during pregnancy are early indicators of maternal diseases, such as stroke and type 2 diabetes. The total number of CRFs typically takes the form of binomial counts that exhibit overdispersion and zero inflation due to correlations among the underlying CRFs. Motivated by an examination of spatiotemporal trends in five CRFs among pregnant women in the U.S. state of South Carolina during the COVID-19 pandemic, we developed a zero-inflated beta-binomial model within a spatiotemporal framework. This model combines a point mass at zero to account for zero inflation and a beta-binomial distribution to model the remaining CRF counts. Given the notable racial disparities in CRFs during pregnancy that vary across the state over time, we incorporate a spatially varying coefficient model to explore the complex relationships between CRFs and geographic and temporal disparities among non-Hispanic White and non-Hispanic Black women. For posterior inference, we developed an efficient hybrid Markov Chain Monte Carlo algorithm that relies on easily sampled Gibbs and Metropolis-Hastings steps. Our analysis of CRFs in South Carolina reveals that certain counties, such as Chesterfield and Clarendon, exhibit gaps in racial health disparities, making them prime candidates for community-level interventions aimed at reducing these disparities.
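
The mixture described in the abstract above (a point mass at zero combined with a beta-binomial distribution) can be written down as a probability mass function. The sketch below is a minimal illustration; the shape parameterisation `(a, b)`, the name `pi_zero`, and the function names are assumptions, and the paper's spatially varying coefficients and regression structure are not represented.

```python
from math import exp, lgamma

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(y, n, a, b):
    """P(Y = y) for a beta-binomial count out of n trials with shapes a, b."""
    log_choose = lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
    return exp(log_choose + log_beta(y + a, n - y + b) - log_beta(a, b))

def zibb_pmf(y, n, pi_zero, a, b):
    """Zero-inflated beta-binomial: with probability pi_zero the count is a
    structural zero, otherwise it is drawn from the beta-binomial."""
    bb = beta_binomial_pmf(y, n, a, b)
    if y == 0:
        return pi_zero + (1.0 - pi_zero) * bb
    return (1.0 - pi_zero) * bb

# Five risk factors per woman, so counts range from 0 to 5.
probabilities = [zibb_pmf(y, 5, 0.3, 1.0, 1.0) for y in range(6)]
print(probabilities, sum(probabilities))
```

With `a = b = 1` the beta-binomial part is uniform on 0 to 5, so the effect of the extra mass at zero is easy to see in the printed probabilities.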

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12625296/pdf/
Citations: 0
Authors' reply to the Discussion of 'Methods for estimating the exposure-response curve to inform the new safety standards for fine particulate matter'.
IF 1.6 Zone 3 (Mathematics) Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-05-21 eCollection Date: 2025-10-01 DOI: 10.1093/jrsssa/qnaf057
Michael Cork, Daniel Mork, Francesca Dominici
Volume 188, Issue 4, Pages 995-1002. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12503113/pdf/
Citations: 0
Estimating racial and ethnic healthcare quality disparities using exploratory item response theory and latent class item response theory models.
IF 1.6 Zone 3 (Mathematics) Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-04-01 DOI: 10.1093/jrsssa/qnaf033
Sharon-Lise Normand, Katya Zelevinsky, Marcela Horvitz-Lennon

Healthcare quality metrics refer to a variety of measures used to characterize what should have been done or not done for a patient or the health consequences of what was or was not done. When estimating healthcare quality, many metrics are measured and combined to provide an overall estimate either at the patient level or at higher levels, such as the provider organization or insurer. Racial and ethnic disparities are defined as the mean difference in quality between minorities and Whites not justified by underlying health conditions or patient preferences. Several statistical features of healthcare quality data have been ignored: quality is a theoretical construct not directly observable; quality metrics are measured on different scales or, if measured on the same scale, have different baseline rates; the construct may be multidimensional; and metrics are correlated within-individuals. Balancing health differences across race and ethnicity groups is challenging due to confounding. We provide an approach addressing these features, utilizing exploratory multidimensional item response theory (IRT) models and latent class IRT models to estimate quality, and optimization-based matching to adjust for confounding among the race and ethnicity groups. Quality metrics measured on 93,000 adults with schizophrenia residing in five US states illustrate approaches.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12377680/pdf/
Citations: 0
Methods for Estimating the Exposure-Response Curve to Inform the New Safety Standards for Fine Particulate Matter.
IF 1.6 Zone 3 (Mathematics) Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2025-01-16 DOI: 10.1093/jrsssa/qnaf004
Michael Cork, Daniel Mork, Francesca Dominici

Exposure to fine particulate matter (PM2.5) poses significant health risks and accurately determining the shape of the relationship between PM2.5 and health outcomes has crucial policy implications. Although various statistical methods exist to estimate this exposure-response curve (ERC), few studies have compared their performance under plausible data-generating scenarios. This study compares seven commonly used ERC estimators across 72 exposure-response and confounding scenarios via simulation. Additionally, we apply these methods to estimate the ERC between long-term PM2.5 exposure and all-cause mortality using data from over 68 million Medicare beneficiaries in the United States. Our simulation indicates that regression methods not placed within a causal inference framework are unsuitable when anticipating heterogeneous exposure effects. Under the setting of a large sample size and unknown ERC functional form, we recommend utilizing causal inference methods that allow for nonlinear ERCs. In our data application, we observe a nonlinear relationship between annual average PM2.5 and all-cause mortality in the Medicare population, with a sharp increase in relative mortality at low PM2.5 concentrations. Our findings suggest that stricter limits on PM2.5 could avert numerous premature deaths. To facilitate the utilization of our results, we provide publicly available, reproducible code on Github for every step of the analysis.

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12433667/pdf/
Citations: 0
Evaluating infectious disease forecasts with allocation scoring rules.
IF 1.6 Zone 3 (Mathematics) Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2024-12-18 DOI: 10.1093/jrsssa/qnae136
Aaron Gerding, Nicholas G Reich, Benjamin Rogers, Evan L Ray

Recent years have seen increasing efforts to forecast infectious disease burdens, with a primary goal being to help public health workers make informed policy decisions. However, there has been only limited discussion of how predominant forecast evaluation metrics might indicate the success of policies based in part on those forecasts. We explore one possible tether between forecasts and policy: the allocation of limited medical resources so as to minimize unmet need. We use probabilistic forecasts of disease burden in each of several regions to determine optimal resource allocations, and then we score forecasts according to how much unmet need their associated allocations would have allowed. We illustrate with forecasts of COVID-19 hospitalizations in the U.S., and we find that the forecast skill ranking given by this allocation scoring rule can vary substantially from the ranking given by the weighted interval score. We see this as evidence that the allocation scoring rule detects forecast value that is missed by traditional accuracy measures and that the general strategy of designing scoring rules that are directly linked to policy performance is a promising direction for epidemic forecast evaluation.
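
The link between forecasts and allocation described in the abstract above can be sketched in a few lines. The example assumes forecasts are given as sample draws per region, resources come in whole units, and unmet need is the total shortfall across regions; the greedy allocator and all function names are illustrative choices, not the authors' implementation.

```python
import numpy as np

def greedy_allocation(samples, budget):
    """Allocate `budget` whole units across regions to minimise expected unmet
    need under the forecast. `samples` has shape (n_draws, n_regions).
    Greedy is optimal here because expected shortfall is convex in each
    region's allocation, so marginal gains never increase."""
    alloc = np.zeros(samples.shape[1], dtype=int)
    for _ in range(budget):
        # Reduction in expected shortfall from giving one more unit to each region.
        gains = (np.maximum(samples - alloc, 0).mean(axis=0)
                 - np.maximum(samples - alloc - 1, 0).mean(axis=0))
        alloc[np.argmax(gains)] += 1
    return alloc

def allocation_unmet_need(samples, observed, budget):
    """Unmet need that the forecast-driven allocation would have allowed,
    evaluated against the observed burden in each region."""
    alloc = greedy_allocation(samples, budget)
    return float(np.maximum(np.asarray(observed) - alloc, 0).sum())

# Two regions, eight units to allocate; observed burden is 10 and 2.
observed = np.array([10, 2])
good_forecast = np.tile([10.0, 2.0], (4, 1))  # draws centred on the truth
bad_forecast = np.tile([2.0, 10.0], (4, 1))   # draws with the regions swapped
print(allocation_unmet_need(good_forecast, observed, 8))
print(allocation_unmet_need(bad_forecast, observed, 8))
```

A lower value is better: the forecast that places the burden in the right region leads to an allocation with less unmet need, even though both forecasts predict the same total.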

PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12371526/pdf/
Citations: 0
Spatio-temporal quasi-experimental methods for rare disease outcomes: the impact of reformulated gasoline on childhood haematologic cancer.
IF 1.6 Zone 3 (Mathematics) Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2024-11-18 eCollection Date: 2025-10-01 DOI: 10.1093/jrsssa/qnae109
Sofia L Vega, Rachel C Nethery

Although some pollutants emitted in vehicle exhaust, such as benzene, are known to cause leukaemia in adults with high exposure levels, less is known about the relationship between traffic-related air pollution (TRAP) and childhood haematologic cancer. In the 1990s, the US EPA enacted the reformulated gasoline program in select areas of the U.S., which drastically reduced ambient TRAP in affected areas. This created an ideal quasi-experiment to study the effects of TRAP on childhood haematologic cancers. However, existing methods for quasi-experimental analyses can perform poorly when outcomes are rare and unstable, as with childhood cancer incidence. We develop Bayesian spatio-temporal matrix completion methods to conduct causal inference in quasi-experimental settings with rare outcomes. Selective information sharing across space and time enables stable estimation, and the Bayesian approach facilitates uncertainty quantification. We evaluate the methods through simulations and apply them to estimate the causal effects of TRAP on childhood leukaemia and lymphoma.

Citation record: Sofia L Vega, Rachel C Nethery. Spatio-temporal quasi-experimental methods for rare disease outcomes: the impact of reformulated gasoline on childhood haematologic cancer. Journal of the Royal Statistical Society Series A-Statistics in Society, 188(4), 1184-1202. Published 2024-11-18. DOI: 10.1093/jrsssa/qnae109. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12503115/pdf/
Citations: 0
Studying Chinese immigrants' spatial distribution in the Raleigh-Durham area by linking survey and commercial data using romanized names.
IF 1.6 Zone 3 (Mathematics) Q2 SOCIAL SCIENCES, MATHEMATICAL METHODS Pub Date: 2024-10-23 eCollection Date: 2025-01-01 DOI: 10.1093/jrsssa/qnae107
Eric A Bai, Botao Ju, Madeleine Beckner, Jerome P Reiter, M Giovanna Merli, Ted Mouw

Many population surveys do not provide information on respondents' residential addresses, instead offering coarse geographies like zip code or higher aggregations. However, fine resolution geography can be beneficial for characterizing neighbourhoods, especially for relatively rare populations such as immigrants. One way to obtain such information is to link survey records to records in auxiliary databases that include residential addresses by matching on variables common to both files. We present an approach based on probabilistic record linkage that enables matching survey participants in the Chinese Immigrants in Raleigh-Durham Study to records from InfoUSA, an information provider of residential records. The two files use different Chinese name romanization practices, which we address through a novel and generalizable strategy for constructing records' pairwise comparison vectors for romanized names. Using a fully Bayesian record linkage model, we characterize the geospatial distribution of Chinese immigrants in the Raleigh-Durham area of North Carolina.
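As a rough illustration of what a pairwise comparison vector looks like in probabilistic record linkage, the sketch below assigns each shared field an agreement level. It does not reproduce the paper's strategy for romanized names. The three-level coding, the 0.8 similarity threshold, the field names, and the helpers `compare_field` and `comparison_vector` are all assumptions made for illustration; string similarity uses the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def compare_field(a, b, threshold=0.8):
    """Return an agreement level for one field:
    2 = exact match, 1 = similar strings, 0 = disagreement,
    None = at least one value is missing."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return None
    if a == b:
        return 2
    if SequenceMatcher(None, a, b).ratio() >= threshold:
        return 1
    return 0


def comparison_vector(rec_a, rec_b, fields):
    """Compare two records field by field (hypothetical field names)."""
    return [compare_field(rec_a.get(f, ""), rec_b.get(f, "")) for f in fields]


# Hypothetical survey record and commercial-file record.
survey = {"surname": "Zhang", "given": "Wei", "zip": "27701"}
vendor = {"surname": "Zhag", "given": "Wei", "zip": ""}
print(comparison_vector(survey, vendor, ["surname", "given", "zip"]))
```

A linkage model then treats these vectors as data and estimates, for each record pair, the probability that the two records refer to the same person. Generic string similarity handles typos but not systematic romanization differences (for example, Pinyin "Zhang" versus Wade-Giles "Chang"), which is the gap the abstract's name-specific comparison strategy is meant to address.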

Citation record: Eric A Bai, Botao Ju, Madeleine Beckner, Jerome P Reiter, M Giovanna Merli, Ted Mouw. Studying Chinese immigrants' spatial distribution in the Raleigh-Durham area by linking survey and commercial data using romanized names. Journal of the Royal Statistical Society Series A-Statistics in Society, 188(1), 84-97. Published 2024-10-23. DOI: 10.1093/jrsssa/qnae107. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11728054/pdf/
Citations: 0