
Epidemiology: Latest Articles

Adapting Back-calculation Methods to Estimate the Incidence of Tuberculosis.
IF 4.4 | Zone 2, Medicine | Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH | Pub Date: 2026-03-01 | Epub Date: 2025-12-05 | DOI: 10.1097/EDE.0000000000001936
Anne N Shapiro, Shariq Mohammed, C Robert Horsburgh, Helen E Jenkins, Laura F White

Background: Although tuberculosis (TB) is a leading cause of death, its global burden is ill-defined. Existing methods to estimate incidence are time- and/or resource-intensive and often inaccurate. Back-calculation was developed to estimate HIV incidence by treating reported cases as a convolution of the incidence of new cases with the disease duration. New estimates of TB natural history parameters allow us to develop Bayesian back-calculation methods for TB that assign case notification data to the time point of disease onset.

Methods: Recorded counts of TB cases underestimate the true burden of disease, so we include a multiplier derived from prevalence-to-notification ratios to account for underreporting. We assume Poisson distributions for notifications and incidence and use a penalized-likelihood prior to smooth the estimates. We estimate sex-stratified TB incidence for Vietnam, Cambodia, and the Philippines via Markov chain Monte Carlo.
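
The convolution relationship that back-calculation relies on can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Bayesian model: the function name `expected_notifications`, the delay distribution, the incidence series, and the `reporting_fraction` (used here as a stand-in for the underreporting multiplier) are all hypothetical.

```python
import numpy as np

def expected_notifications(incidence, delay_pmf, reporting_fraction=1.0):
    """Expected notified cases per period under a simple back-calculation model.

    incidence[s]       : new cases with disease onset in period s (hypothetical)
    delay_pmf[d]       : probability that a case is notified d periods after onset
    reporting_fraction : assumed share of true cases that are ever notified
    """
    incidence = np.asarray(incidence, dtype=float)
    delay_pmf = np.asarray(delay_pmf, dtype=float)
    # Discrete convolution: mu[t] = sum_s incidence[s] * delay_pmf[t - s]
    full = np.convolve(incidence, delay_pmf)
    # Keep only the periods covered by the incidence series
    return reporting_fraction * full[: len(incidence)]

# Hypothetical inputs: 100 new cases per year, notification 0, 1, or 2 years
# after onset, and 80% of true cases eventually notified.
mu = expected_notifications([100, 100, 100, 100], [0.5, 0.3, 0.2], 0.8)
print(mu)
```

Back-calculation runs this relationship in reverse: given observed notifications and an assumed delay distribution, it infers the incidence series that most plausibly produced them.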

Results: Annual estimated TB incidence was, on average, 19% greater than recorded notifications. TB incidence among males was on average 3.8% higher than among females in Vietnam, 1.3% higher in Cambodia, and 2.5% higher in the Philippines.

Conclusions: These estimates account for the delay between bacteriologically positive subclinical disease and notification and, as such, may be more temporally accurate than existing methods.

Pages: 220-227 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12851544/pdf/
Citations: 0
Considerations for Estimating Causal Effects of Informatively Timed Treatments.
IF 4.4 | Zone 2, Medicine | Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH | Pub Date: 2026-03-01 | Epub Date: 2026-01-29 | DOI: 10.1097/EDE.0000000000001943
Arman Oganisian

Epidemiologic studies are often concerned with estimating causal effects of a sequence of treatment decisions on survival outcomes. In many settings, treatment decisions do not occur at fixed, prespecified follow-up times. Rather, timing varies across subjects in ways that may be informative of subsequent treatment decisions and potential outcomes. Awareness of the issue and potential solutions is lacking in the literature, which motivates this work. Here, we formalize the issue of informative timing, describe the problems associated with ignoring it, and show how g-methods can be used to analyze sequential treatments that are informatively timed. As we describe, in such settings, the waiting times between successive treatment decisions may be properly viewed as time-varying confounders. Using synthetic examples, we illustrate how g-methods that do not adjust for these waiting times may be biased and how adjustment can be done in scenarios where patients may die or be censored between treatments. Finally, we provide implementation guidance and examples using publicly available software. Our concluding message is that (1) considering timing is important for valid inference and (2) correcting for informative timing can be done with g-methods that adjust for waiting times between treatments as time-varying confounders.

Pages: 166-176 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12851548/pdf/
Citations: 0
Historical Neighborhood Redlining and Fertility in a Cohort of US Black Women.
IF 4.4 | Zone 2, Medicine | Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH | Pub Date: 2026-03-01 | Epub Date: 2026-01-29 | DOI: 10.1097/EDE.0000000000001942
Mary D Willis, Chen Sheng, Sharonda M Lovett, Talia Feldscher, Kendra D Sims, Brittney Francis, Jacqueline M Hicks, Etienne X Holder, Lauren A Wise, Yvette C Cozier, Amelia K Wesselink

Background: Structural racism can manifest in contemporary neighborhoods via historical policies or programs. For example, the Home Owners' Loan Corporation, a government-backed program from the 1930s, systematically diverted wealth away from Black neighborhoods. The reproductive health consequences of this racist program remain understudied. We evaluated associations between residence in a historically redlined neighborhood and fecundability, the per-cycle probability of conception.

Methods: We used data from the Black Women's Health Study, a US cohort of Black women who were aged 21-69 years in 1995 and were followed up with biennial questionnaires. Experiences of infertility (i.e., tried for ≥12 months to conceive without success) were captured on several questionnaires. A 2011 supplemental module collected pregnancy histories between 1995 and 2011, including planning status and time to conception. We linked geocoded addresses to historical Home Owners' Loan Corporation grades (A ["best"] to D ["hazardous," i.e., redlined]). Using proportional probabilities regression models with generalized estimating equations, we estimated fecundability ratios and 95% confidence intervals (CIs).

Results: Our analysis included 818 planned pregnancy attempts from 674 participants (mean age = 33.9 years). Relative to participants residing in neighborhoods with the highest grades (A or B), adjusted models showed reduced fecundability among participants who resided in lower graded neighborhoods (C: 0.91, 95% CI: 0.77, 1.09; D: 0.82, 95% CI: 0.68, 0.99).

Conclusions: In this cohort of US Black women, contemporary residence in a historically redlined neighborhood was associated with reduced fecundability. Our findings highlight the importance of exploring how historical neighborhood disinvestment affects reproductive health.

Pages: 268-277 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12765556/pdf/
Citations: 0
State minimum wages and food insecurity among households receiving government food assistance.
IF 4.4 | Zone 2, Medicine | Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH | Pub Date: 2026-01-30 | DOI: 10.1097/EDE.0000000000001955
Krista Neumann, Barbara A Laraia, Corinne A Riddell

Background: While the Supplemental Nutrition Assistance Program (SNAP) aims to reduce food insecurity among low-income households, nearly half of recipients remain food insecure. Increasing state minimum wages could help improve food security, but because SNAP benefits are income-dependent, net effects are unclear.

Methods: Using the U.S. Current Population Survey Food Security Supplement (2002-2019), we linked households interviewed in two consecutive Decembers to create a two-year panel. The primary sample included SNAP recipient households with at least one adult working in year 1. The exposure was the average effective state minimum wage (2019 dollars) for each state and year. We estimated prevalence differences (PD) in food insecurity per $1 increase in minimum wage using a within-household linear fixed-effects model adjusting for time-varying economic confounders and concurrent safety-net policies. We investigated variation in the effect by household and family structure, race/ethnicity, and educational attainment using stratified models.
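
With exactly two observations per household, a within-household linear fixed-effects slope reduces to a regression of the change in the outcome on the change in the exposure. The sketch below illustrates only that reduction. It assumes no covariates and no period effect, the function name `within_household_pd` is ours, and the data are hypothetical, so it is not the adjusted model described in the abstract.

```python
import numpy as np

def within_household_pd(wage_y1, wage_y2, insecure_y1, insecure_y2):
    """Two-period within-household slope (no covariates, no period effect).

    Household fixed effects cancel when each household's year-1 values are
    subtracted from its year-2 values, leaving a no-intercept regression of
    the outcome change on the exposure change.
    """
    dx = np.asarray(wage_y2, dtype=float) - np.asarray(wage_y1, dtype=float)
    dy = np.asarray(insecure_y2, dtype=float) - np.asarray(insecure_y1, dtype=float)
    return float(np.sum(dx * dy) / np.sum(dx ** 2))

# Hypothetical panel of four households (wages in dollars; 1 = food insecure)
pd_per_dollar = within_household_pd(
    wage_y1=[8, 8, 9, 10],
    wage_y2=[9, 8, 10, 10],
    insecure_y1=[1, 1, 0, 1],
    insecure_y2=[0, 1, 0, 1],
)
print(pd_per_dollar)
```

Households whose minimum wage did not change contribute nothing to the estimate, which is why identification in such designs comes from states that changed their wage floor between the two interviews.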

Results: Overall estimates were most compatible with protective effects (PD per 10,000 households: -298, 95% CI: -673, 77). The strongest protective estimates were for senior-headed (-1,472, 95% CI: -2,869, -76), Hispanic (-865, 95% CI: -1,638, -92), and some-college households (-988, 95% CI: -1,664, -312). Estimates for Indigenous households were imprecise and possibly harmful (900, 95% CI: -736, 2,537). Most other subgroup estimates were near zero.

Conclusions: Increased minimum wages may modestly support food security for working SNAP households. As SNAP benefit rules evolve, these findings suggest that minimum-wage policies can complement and reinforce the program's goals to protect low-income households from food hardship.

Citations: 0
Development and validation of gestational age estimation algorithms for non-live births in administrative healthcare databases.
IF 4.4 | Zone 2, Medicine | Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH | Pub Date: 2026-01-30 | DOI: 10.1097/EDE.0000000000001956
Yongtai Cho, Eun-Young Choi, Hyesung Lee, Yunha Noh, Jung Yeol Han, Seung-Ah Choe, Hoon Kim, Ju-Young Shin

Background: Algorithms to estimate gestational age (GA) for non-live births have been developed in other administrative healthcare databases, but their applicability in Korea remains unknown. We adapted algorithms developed in the United States and evaluated their validity in Korean healthcare claims data.

Methods: Using the National Health Information Database (NHID) of South Korea, we linked GA information at influenza vaccination from the national vaccination registry to establish a reference standard. Non-live births were stratified into spontaneous/induced abortions and stillbirths. Four algorithms were tested: (1) assigning outcome-specific GA, (2) adjusting GA based on ultrasound scan records, (3) a regression model using gestational markers as predictors, and (4) a random forest model. Algorithms were evaluated by the proportion of estimates within 1-4 weeks of the reference standard and the mean squared error (MSE). External validation was conducted using an independent dataset.
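
The two evaluation metrics named above, the proportion of estimates within a tolerance of the reference standard and the mean squared error, can be computed as follows. This is a minimal sketch: the function name `evaluate_ga` and the gestational ages are hypothetical and are not study data.

```python
import numpy as np

def evaluate_ga(estimated_weeks, reference_weeks, tolerance_weeks=2):
    """Return (proportion within tolerance, mean squared error in weeks^2)."""
    est = np.asarray(estimated_weeks, dtype=float)
    ref = np.asarray(reference_weeks, dtype=float)
    errors = est - ref
    # Share of estimates whose absolute error is at most the tolerance
    within = float(np.mean(np.abs(errors) <= tolerance_weeks))
    # Average squared error, so large misses are penalized more heavily
    mse = float(np.mean(errors ** 2))
    return within, mse

# Hypothetical gestational ages in weeks
within_2wk, mse = evaluate_ga([8, 10, 12, 20], [9, 10, 15, 20])
print(within_2wk, mse)
```

The two metrics answer different questions: the within-tolerance proportion ignores how far off the misses are, while the MSE can be dominated by a few large errors.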

Results: Random forests performed best for both spontaneous/induced abortions (MSE 1.68 weeks²) and stillbirths (MSE 0.97 weeks²), with 92.6% (95% CI 91.6-93.4) and 97.4% (96.2-98.3) of predictions falling within two weeks of the reference standard, respectively. However, in the external validation set, the ultrasound record-based adjustment approach performed similarly to the random forest approach for both spontaneous/induced abortions (MSE 8.37 vs. 8.15 weeks²) and stillbirths (MSE 12.42 vs. 12.52 weeks²).

Conclusions: Deterministic approaches may be preferable for estimating GA of non-live births in the NHID, as they are simpler to implement and perform comparably to model-based algorithms. These algorithms can support pregnancy research in the Korean population.

Citations: 0
Mind the gap: addressing missing person time when estimating outcome incidence in longitudinal data.
IF 4.4 | Zone 2, Medicine | Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH | Pub Date: 2026-01-30 | DOI: 10.1097/EDE.0000000000001958
Jacqueline E Rudolph, Rachael K Ross, Lauren C Zalla, Shruti H Mehta, Gregory D Kirk, Becky L Genberg, Bryan Lau, Catherine R Lesko

Background: Longitudinal data often include gaps in observation when outcomes (and other variables) are unmeasured, due to missed study visits or drop out. We explore the fundamentals of data gaps and use simulation to compare approaches for handling data gaps when estimating outcome incidence.

Methods: We generated a simulation of 1000 individuals across 10 study visits. We used 4 data-generating mechanisms: (1) missingness was independent of the outcome; (2) there was a baseline common cause of missingness and the outcome; (3) there was a time-varying common cause; and (4) the outcome directly affected future missingness. We estimated the risk and rate of the first outcome occurrence (generated as a transient, repeated, or permanent outcome) using crude and adjusted approaches across 1000 iterations, and compared bias and empirical standard error.
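
One of the crude approaches compared here, censoring each participant just before their first data gap, can be sketched as a discrete-time product-limit estimate of risk. The function name `crude_risk_censor_at_gap`, the status codes, and the visit histories are hypothetical; this is an illustration of the general idea rather than the authors' simulation code.

```python
def crude_risk_censor_at_gap(histories):
    """Cumulative risk of a first event, censoring before the first missed visit.

    Each history is a list of per-visit status codes:
      "ok"    = visit attended, no event
      "event" = visit attended, first event observed
      "miss"  = visit missed (start of a data gap)
    """
    n_visits = max(len(h) for h in histories)
    at_risk = [0] * n_visits
    events = [0] * n_visits
    for history in histories:
        for t, status in enumerate(history):
            if status == "miss":
                break  # censor here; anything after the gap is ignored
            at_risk[t] += 1
            if status == "event":
                events[t] += 1
                break  # follow-up ends at the first event
    survival = 1.0
    for n, d in zip(at_risk, events):
        if n > 0:
            survival *= 1 - d / n  # conditional probability of no event at this visit
    return 1 - survival

# Hypothetical histories for four participants
risk = crude_risk_censor_at_gap([
    ["ok", "ok", "event"],
    ["ok", "miss", "event"],  # event after the gap is not counted
    ["event"],
    ["ok", "ok", "ok"],
])
print(risk)
```

Censoring at the gap is valid only when missingness is unrelated to the outcome. Under the other data-generating mechanisms, weighting or imputation is needed to account for who goes missing.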

Results: Under Scenario 1, in crude analyses, results were unbiased when censoring prior to a data gap but not when allowing participants to return. Under Scenarios 2-4, all crude approaches were biased. Inverse probability of censoring weights and multiple imputation were relatively unbiased across scenarios and outcome types; multiple imputation was more precise. Inverse probability of observation weights were biased when the outcome was permanent and were less precise than either of the other two approaches.

Conclusions: Crude approaches allowing participants to return following a data gap are not recommended because they can be biased even when missingness and the outcome are independent. Instead, one should either censor or handle the data gap using multiple imputation.

Citations: 0
Long-term cardiovascular outcomes following bariatric surgery: Reconciling seemingly conflicting evidence.
IF 4.4 | Zone 2, Medicine | Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH | Pub Date: 2026-01-21 | DOI: 10.1097/EDE.0000000000001952
Sebastien Haneuse, Luke Benz, Valerie A Smith, David Arterburn, Matthew L Maciejewski

Substantial observational evidence supports an association between bariatric surgery and reduced risk for a wide range of outcomes, including cardiovascular disease (CVD) in patients with diabetes. Two recent studies, however, argued that much of that prior work suffers from various sources of underappreciated bias as well as design decisions that compromise whether one can conceive of a corresponding target trial. Furthermore, results based on analyses of claims data from Optum and electronic health record data from the Veterans Administration (VA) are presented as providing evidence of no CVD benefit for bariatric surgery in patients with diabetes. In this paper, we use data from a prior Kaiser Permanente study to emulate a trial that mimics the methods employed in the VA study. This new analysis finds a reduction in risk of CVD in patients with diabetes, consistent with pre-existing evidence. We discuss possible mechanisms by which the discrepant results can be reconciled, including issues of statistical validity that arise from small samples, whether recent work on transportability indicates that we should not always expect results to be concordant, and the role of conservatism associated with "clinical trial thinking". We conclude with a discussion of what standards should be used when considering the work of others in the literature and the role that evidence triangulation may play in the future.

Citations: 0
Defining and Estimating Outcomes Directly Averted by a Vaccination Program when Rollout Occurs Over Time.
IF 4.4 Zone 2 (Medicine) Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date: 2026-01-19 DOI: 10.1097/EDE.0000000000001953
Katherine M Jia, Christopher B Boyer, Alyssa Bilinski, Marc Lipsitch

During the COVID-19 pandemic, estimating the total deaths averted by vaccination was of great public health interest. Instead of estimating total deaths averted by vaccination among both vaccinated and unvaccinated individuals, some studies empirically estimated only "directly averted" deaths among vaccinated individuals, typically suggesting that, because of the indirect effect, vaccines prevented more deaths across unvaccinated and vaccinated individuals combined than directly among vaccinated individuals alone. Here, we define the causal estimand to quantify outcomes "directly averted" by vaccination (i.e., the impact of vaccination for vaccinated individuals, holding vaccination coverage fixed) for vaccination at multiple time points, which is a lower bound on the total outcomes averted when the indirect effect is non-negative. We develop an unbiased estimator for the causal estimand in a one-stage randomized controlled trial (RCT) and explore the bias of a popular "hazard difference" estimator frequently used in empirical studies. We show that even in an RCT, the hazard difference estimator is biased if vaccination has a non-null effect, as it fails to incorporate the greater depletion of susceptibles among the unvaccinated individuals. In simulations, the overestimation is small for averted deaths when the infection-fatality rate is low, as it is for many important pathogens. However, the overestimation can be large for averted infections given a high basic reproduction number and a high vaccine efficacy against infection. Additionally, we define and compare estimands and estimators for avertible outcomes (i.e., outcomes that could have been averted by vaccination, but were not due to failure to vaccinate). Future studies can explore the identifiability of the causal estimand in observational settings.
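
The depletion-of-susceptibles argument can be illustrated with a minimal sketch. This is not the authors' estimator or simulation. It assumes a constant infection hazard among unvaccinated individuals, a "leaky" vaccine that multiplies that hazard by (1 - VE), vaccination of the whole arm at time 0, and no indirect effects. The function names and all parameter values are hypothetical.

```python
import math


def directly_averted_truth(n_vacc, hazard, ve, horizon):
    """Difference in cumulative risk for the vaccinated arm with vs. without
    vaccination, under the constant-hazard assumptions stated above."""
    risk_if_unvaccinated = 1.0 - math.exp(-hazard * horizon)
    risk_if_vaccinated = 1.0 - math.exp(-hazard * (1.0 - ve) * horizon)
    return n_vacc * (risk_if_unvaccinated - risk_if_vaccinated)


def hazard_difference_estimate(n_vacc, hazard, ve, horizon, steps=10_000):
    """One simplified reading of a hazard-difference calculation: in each small
    interval, multiply (unvaccinated hazard - vaccinated hazard) by the number
    of vaccinated people still at risk, then sum over intervals."""
    dt = horizon / steps
    vacc_hazard = hazard * (1.0 - ve)
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt  # midpoint of the interval
        at_risk = n_vacc * math.exp(-vacc_hazard * t)
        total += (hazard - vacc_hazard) * at_risk * dt
    return total


# Hypothetical scenarios: a common outcome (infection) and a rare one (death).
for label, hazard in [("common outcome", 1.0), ("rare outcome", 0.01)]:
    truth = directly_averted_truth(1000, hazard, ve=0.5, horizon=1.0)
    estimate = hazard_difference_estimate(1000, hazard, ve=0.5, horizon=1.0)
    print(f"{label}: truth={truth:.2f}, hazard-difference={estimate:.2f}")
```

Under these assumptions the hazard-difference total exceeds the true difference in cumulative risk, because it applies the unvaccinated hazard to vaccinated people who would already have been infected had they gone unvaccinated. The gap shrinks as the outcome becomes rare, which is in line with the abstract's statement about averted deaths when the infection-fatality rate is low.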

Citations: 0
Differential Reporting of Severe Maternal Morbidity on US Birth Certificate and Claims Data by Race and Ethnicity.
IF 4.4 Zone 2 (Medicine) Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date: 2026-01-19 DOI: 10.1097/EDE.0000000000001954
Beth L Pineles, Anthony D Harris, Lisa Pineles, Esa M Davis, K S Joseph, Enrique Schisterman, Laurence S Magder, Katherine E Goodman

Objective: To compare reporting of severe maternal morbidity (SMM) in birth certificate versus hospital claims data and evaluate differences by patient race/ethnicity.

Methods: We compared incidence rates of blood transfusion, hysterectomy, intensive care unit admission, uterine rupture, and third- or fourth-degree perineal laceration between 2019 deliveries in the US birth certificate and the Premier Healthcare Database, overall and stratified by maternal race/ethnicity. We then computed incidence rate ratios (IRRs) between datasets and fit logistic regression models of race/ethnicity on SMM.

Results: In a comparison of 3,467,934 birth certificate deliveries with 3,450,569 Premier deliveries (n=905,766 before weighting for national representativeness), incidence rates of SMM were lower in birth certificate data than in Premier data, and these rate differentials varied by maternal race/ethnicity. For example, among non-Hispanic white patients, the incidence rate of blood transfusion in birth certificate data was 50% of that in the Premier claims dataset (IRR: 0.50, 95% CI: 0.47, 0.52). Among all other races/ethnicities, the incidence rate of blood transfusion was even lower relative to the claims data (IRR range: 0.29-0.39). Adjusted odds ratios (aORs) for SMM in non-Hispanic Black and Hispanic patients versus non-Hispanic white patients were closer to the null in birth certificate data than in Premier data (e.g., compared with non-Hispanic white patients, non-Hispanic Black patients had 16% higher adjusted odds of blood transfusion in birth certificate data (95% CI: 1.10, 1.21) versus 84% higher adjusted odds in Premier data (95% CI: 1.79, 1.89)).

Conclusions: Birth certificates report substantially less SMM than claims data, with a greater reporting differential for non-Hispanic Black and Hispanic patients, which may bias birth certificate-based research findings.
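
As a rough illustration of how an incidence rate ratio between two data sources and its confidence interval can be computed, the sketch below uses the standard log-scale Wald interval for a ratio of two Poisson rates. It is an assumption-laden example, not the authors' analysis: it treats deliveries as the denominator, ignores the survey weighting applied to the Premier data, and uses hypothetical counts and a hypothetical function name.

```python
import math


def incidence_rate_ratio(events_a, denom_a, events_b, denom_b, z=1.96):
    """Return (IRR, lower, upper) comparing source A with source B.

    Uses a Wald interval on the log scale, with standard error
    sqrt(1/events_a + 1/events_b), which assumes Poisson-distributed counts.
    """
    if events_a <= 0 or events_b <= 0:
        raise ValueError("both event counts must be positive")
    irr = (events_a / denom_a) / (events_b / denom_b)
    se_log = math.sqrt(1.0 / events_a + 1.0 / events_b)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical counts: 50 events per 10,000 deliveries in source A
# versus 100 events per 10,000 deliveries in source B.
irr, lower, upper = incidence_rate_ratio(50, 10_000, 100, 10_000)
print(f"IRR = {irr:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```

An IRR below 1 here means source A records the event less often than source B for the same number of deliveries, which is the direction of the birth certificate versus claims comparison reported above.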

Citations: 0
Sensitivity of Cancer Registry Linkage with Missing or Incomplete Social Security Number and Implications for Cancer Cohorts.
IF 4.4 Zone 2 (Medicine) Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH Pub Date: 2026-01-01 Epub Date: 2025-09-09 DOI: 10.1097/EDE.0000000000001913
Lauren E McCullough, Anusila Deka, Christina Newton, Peter Briggs, Erin Gardner, Kevin C Ward, Lauren R Teras, Alpa V Patel

Background: Linking cancer cohort participants to state cancer registries typically relies on personally identifiable information, including social security numbers (SSN), which uniquely identify individuals. However, complete SSN collection can be limited due to privacy concerns. This study evaluates the sensitivity of cancer registry linkage using partial or missing SSN and examines differences by demographic characteristics.

Methods: Using data from 284,361 participants in the Cancer Prevention Study-3, we conducted probabilistic linkages with cancer registries in Georgia, Ohio, and Texas using Match*Pro software. Participants were linked using combinations of personally identifiable information: complete SSN, partial SSN (last four digits), and missing SSN. We compared the sensitivity of linkages before and after manual review and stratified by sex, age, and race-ethnicity.

Results: Before manual review, the sensitivity for missing and partial SSNs was 92.5%. Sensitivity improved to 98.6% for missing SSN and 98.8% for partial SSN after manual review. We observed no notable heterogeneity by sex, age, or race-ethnicity, with sensitivity exceeding 87% across all subgroups. Manual review substantially reduced uncertain matches, contributing to high linkage accuracy.

Conclusion: This study demonstrates that high sensitivity in cancer registry linkage can be achieved without a complete SSN, provided other personally identifiable information (e.g., name, date of birth, longitudinal address) is available. These findings support the feasibility of accurate cancer case identification in cohorts with limited SSN data, particularly for historically marginalized populations, and underscore the importance of designing inclusive population-based cancer studies.
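
For readers unfamiliar with the metric, linkage sensitivity is the share of true cohort-registry matches that a linkage strategy recovers. The sketch below computes it with a Wilson score interval. It assumes that some reference linkage (for example, one using the complete SSN) defines the true matches, which is an assumption made here for illustration. The counts and the function name are hypothetical and are not taken from the study.

```python
import math


def linkage_sensitivity(true_links_found, true_links_missed, z=1.96):
    """Return (sensitivity, lower, upper) using a Wilson score interval.

    true_links_found: reference-standard matches the strategy also identified.
    true_links_missed: reference-standard matches the strategy failed to find.
    """
    n = true_links_found + true_links_missed
    if n == 0:
        raise ValueError("at least one reference-standard match is required")
    p = true_links_found / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return p, center - half_width, center + half_width


# Hypothetical example: 925 of 1,000 reference matches recovered.
sens, lower, upper = linkage_sensitivity(925, 75)
print(f"sensitivity = {sens:.3f} (95% CI: {lower:.3f}, {upper:.3f})")
```

The Wilson interval is used rather than the simple Wald interval because it behaves better when the proportion is close to 1, as linkage sensitivities usually are.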

Citations: 0