Isabelle Niedhammer, Hélène Sultan-Taïeb, Yamna Taouk, Anthony D LaMontagne
In a recent paper, Ghoroubi et al. (Am J Epidemiol 2025 Jan 8;194(1):302-310) used the indirect attributable fraction (AF) method to estimate the fractions of all-cause mortality attributable to work-related factors. This commentary discusses the limitations and potential of this paper and provides insights and guidance to make optimal use of indirect AF estimation in occupational epidemiology. The crucial steps are the choice of the datasets and input data related to the prevalence of exposure and relative risk (RR), requiring comparability of time period, population characteristics, and the definition and measurement of exposure. Published systematic literature reviews with meta-analyses are essential for providing estimates of RR; if none are available, meta-analyses should be conducted. Finally, it is important to verify the assumptions for the chosen AF formula, including evidence of causality, consideration of confounding, and (in)dependence between exposures when several exposures are studied at the same time. We conclude by suggesting that the paper by Ghoroubi et al. may have provided a proof of concept for 1 work-related factor only, but considerable additional research will be required to represent work-related factors overall.
{"title":"To what extent can attributable fractions in occupational epidemiology be estimated in the absence of key data?","authors":"Isabelle Niedhammer, Hélène Sultan-Taïeb, Yamna Taouk, Anthony D LaMontagne","doi":"10.1093/aje/kwaf188","DOIUrl":"10.1093/aje/kwaf188","url":null,"abstract":"<p><p>In a recent paper, Ghoroubi et al. (Am J Epidemiol 2025 Jan 8;194(1):302-310) used the indirect attributable fraction (AF) method to provide estimates of fractions of all-cause mortality attributable to work-related factors. This commentary discusses the limitations and potential of this paper and provides insights and guidance to make optimal use of indirect AF estimation in occupational epidemiology. The crucial steps are the choice of the datasets and input data related to the prevalence of exposure and relative risk (RR), requiring comparability of time period, population characteristics, and the definition and measurement of exposure. Published systematic literature reviews with meta-analyses are essential or, if not available, conducting meta-analyses to provide estimates of RR. Finally, it is important to verify the assumptions for the chosen AF formula including evidence of causality, consideration of confounding and (in)dependence between exposures when several exposures are studied at the same time. We conclude by suggesting that the paper by Ghoroubi et al. 
may have provided a proof of concept for 1 work-related factor only, but considerable additional research will be required to represent work-related factors overall.</p>","PeriodicalId":7472,"journal":{"name":"American journal of epidemiology","volume":" ","pages":"557-561"},"PeriodicalIF":4.8,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144938765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
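As a rough illustration of the indirect AF approach discussed in the commentary above, the sketch below applies Levin's formula, one commonly used AF formula, to a prevalence of exposure and an RR. The commentary does not say which formula Ghoroubi et al. used, so the choice of Levin's formula, the function names, and all input values are assumptions made for illustration only. The combined AF shown here is valid only if the exposures are independent and act multiplicatively, which is exactly the kind of assumption the commentary asks readers to verify.

```python
from math import prod
from typing import Iterable


def levin_af(prevalence: float, rr: float) -> float:
    """Levin's attributable fraction: p(RR - 1) / (1 + p(RR - 1)).

    Assumes the RR is causal and unconfounded, and that the prevalence
    comes from a population comparable to the one the RR was estimated in.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    if rr <= 0.0:
        raise ValueError("relative risk must be positive")
    excess = prevalence * (rr - 1.0)
    return excess / (1.0 + excess)


def combined_af(afs: Iterable[float]) -> float:
    """Combined AF for several exposures: 1 - prod(1 - AF_i).

    Only appropriate when exposures are independent and their effects
    multiply; summing individual AFs can otherwise overstate the burden.
    """
    return 1.0 - prod(1.0 - af for af in afs)


# Hypothetical inputs, not taken from any of the papers discussed here.
af_a = levin_af(prevalence=0.20, rr=1.5)
af_b = levin_af(prevalence=0.10, rr=2.0)
af_both = combined_af([af_a, af_b])
print(round(af_a, 4), round(af_b, 4), round(af_both, 4))
```

Because the two inputs enter the formula multiplicatively, a mismatch between the population behind the prevalence estimate and the population behind the RR propagates directly into the AF, which is why the comparability of input data matters so much.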
Dezhong Chen, Yiyue Yin, Dongmei Yu, Ling Zhang, Weiyi Chen, Jian Xu, Ting Xiao, Hung Chak Ho, G Neil Thomas, Yu Huang, Xiang Qian Lao
Evidence demonstrating the beneficial effects of improved air quality on lipid health is scarce. This study addresses this gap by examining whether reducing PM2.5 exposure can decrease the risk of dyslipidemia. We conducted a longitudinal quasi-experimental study using the Taiwan MJ and Hong Kong MJ cohorts from 2000 to 2018. A total of 8808 adults with consistently high PM2.5 exposure (≥25 μg/m³) were paired with 4612 adults whose PM2.5 exposure decreased from high to low levels (<25 μg/m³) using propensity score matching. Cox regression models with time-dependent covariates were used to analyze the associations between PM2.5 reduction and the risk of dyslipidemia, as well as individual lipid abnormalities. We found that participants whose PM2.5 exposure decreased had a significantly lower risk of dyslipidemia than their counterparts (hazard ratio [HR], 0.75; 95% confidence interval [CI], 0.68-0.84). Nonlinear concentration-response relationships were observed. Similar associations were found for elevated TC (HR, 0.61; 95% CI, 0.51-0.74), elevated LDL-C (HR, 0.69; 95% CI, 0.57-0.84), and decreased HDL-C (HR, 0.59; 95% CI, 0.47-0.75). Reducing PM2.5 exposure significantly lowers the risk of dyslipidemia and improves lipid profiles, providing direct evidence of the health benefits associated with air quality improvement.
{"title":"Reducing PM2.5 exposure lowers dyslipidemia risk: a longitudinal quasi-experimental study.","authors":"Dezhong Chen, Yiyue Yin, Dongmei Yu, Ling Zhang, Weiyi Chen, Jian Xu, Ting Xiao, Hung Chak Ho, G Neil Thomas, Yu Huang, Xiang Qian Lao","doi":"10.1093/aje/kwaf192","DOIUrl":"10.1093/aje/kwaf192","url":null,"abstract":"<p><p>Evidence demonstrating the beneficial effects of improved air quality on lipid health is scarce. This study addresses this gap by examining whether reducing PM2.5 exposure can decrease the risk of dyslipidemia. We conducted a longitudinal quasi-experimental study using the Taiwan MJ and Hong Kong MJ cohorts from 2000 to 2018. A total of 8808 adults with consistently high PM2.5 exposure (≥ 25 μg/m3) were paired with 4612 adults whose PM2.5 exposure decreased from high to low levels (< 25 μg/m3) using propensity score matching. Cox regression models with time-dependent covariates were used to analyze the associations between PM2.5 reduction and the risk of dyslipidemia, as well as individual lipid abnormalities. We found that participants with reducing PM2.5 exposure had a significantly lower risk of dyslipidemia compared to their counterparts (hazard ratio [HR], 0.75; 95% confidence interval [CI], 0.68-0.84). Nonlinear concentration-response relationships were observed. Similar associations were found for elevated TC (HR, 0.61; 95% CI, 0.51-0.74) and LDL-C (HR, 0.69; 95% CI, 0.57-0.84), and decreased HDL-C (HR, 0.59; 95% CI, 0.47-0.75). 
Reducing PM2.5 exposure significantly lowers the risk of dyslipidemia and improves lipid profiles, providing direct evidence of the health benefits associated with air quality improvement.</p>","PeriodicalId":7472,"journal":{"name":"American journal of epidemiology","volume":" ","pages":"524-532"},"PeriodicalIF":4.8,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144938753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wen Zhou, Lorelei A Mucci, Mingyang Song, Hongbing Shen, Christopher I Amos
Mendelian randomization can reveal the etiological association between body mass index (BMI) and lung cancer. However, the associations between the trajectories of BMI and the risk of lung cancer remain inconclusive. We employed growth mixture modeling to identify trajectories of pre-diagnostic BMI in 163 545 individuals (117 445 women from the Nurses' Health Study and 46 100 men from the Health Professionals Follow-Up Study). We assessed the associations between BMI trajectories and lung cancer risk, as well as the effects within subgroups. Four trajectories were identified: normal-moderate increasing (class 1), overweight-marked increasing (class 2), overweight-obese turning (class 3), and obese-persistent (class 4). We observed a decreased risk of lung cancer in class 2 (adjusted hazard ratio [aHR], 0.53; 95% CI, 0.38-0.75; P = 2.32 × 10⁻⁴) and class 3 (aHR, 0.67; 95% CI, 0.48-0.94; P = .022). In stratification analysis, we observed that the effects of class 4 on lung cancer risk varied among histological subtypes. Additionally, within the class 1 population, the top quintile of BMI also demonstrated different effects among histological subtypes. Increasing lifetime BMI was associated with a decreased risk of lung cancer, with this association varying by histological subtype, indicating histology-specific mechanisms in lung carcinogenesis.
{"title":"Pre-diagnostic body mass index trajectories and associations with lung cancer risk.","authors":"Wen Zhou, Lorelei A Mucci, Mingyang Song, Hongbing Shen, Christopher I Amos","doi":"10.1093/aje/kwaf084","DOIUrl":"10.1093/aje/kwaf084","url":null,"abstract":"<p><p>Mendelian randomization can reveal the etiological association between body mass index (BMI) and lung cancer. However, the associations between the trajectories of BMI and the risk of lung cancer remain inconclusive. We employed growth mixture modeling to identify trajectories of pre-diagnostic BMI in 163 545 individuals (117 445 women from the Nurses' Health Study and 46 100 men from the Health Professionals Follow-Up Study). We assessed the associations between BMI trajectories and lung cancer risk, as well as the effects within subgroups. Four trajectories were identified: normal-moderate increasing (class 1), overweight-marked increasing (class 2), overweight-obese turning (class 3), and obese-persistent (class 4). We observed a decreased risk of lung cancer in class 2 (adjusted hazard ratio [aHR], 0.53; 95% CI, 0.38-0.75; P = 2.32 ×10-4) and class 3 (aHR, 0.67; 95% CI, 0.48-0.94; P = .022). In stratification analysis, we observed that the effects of class 4 on lung cancer risk vary among histological subtypes. Additionally, within the class 1 population, the top quintile of BMI also demonstrated different effects among histological subtypes. 
Increasing lifetime BMI was associated with a decreased risk of lung cancer, with this association varying by histological subtypes, indicating histology-specific mechanisms in lung carcinogenesis.</p>","PeriodicalId":7472,"journal":{"name":"American journal of epidemiology","volume":" ","pages":"358-366"},"PeriodicalIF":4.8,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145197930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Paul Madley-Dowd, Rachael A Hughes, Maya B Mathur, Jon Heron, Kate Tilling
Missing data are a pervasive problem in epidemiology, and multiple imputation (MI) is a commonly used analysis method. MI is valid when data are missing at random (MAR). However, definitions of MAR with multiple incomplete variables are not easily interpretable and descriptions of graphical model-based conditions are not accessible to applied researchers. Previous literature shows that MI may be valid in subsamples, even if not in the full dataset. Practical guidance on applying MI with multiple incomplete variables is lacking. We present an algorithm using directed acyclic graphs to determine when MI will estimate an exposure-outcome coefficient without bias. We extend the algorithm to assess whether MI in a subsample of the data, in which some variables are complete and the remaining variables are imputed, will be valid and unbiased for the exposure-outcome coefficient. We apply the algorithm to several simple exemplars, and in a more complex real-life example highlight that only subsample-MI of the outcome would be valid. Our algorithm provides researchers with the tools to decide whether to use MI in practice when there are multiple incomplete variables. Further work could focus on the likely size and direction of biases and the impact of different missing data patterns.
{"title":"Using directed acyclic graphs to determine whether multiple imputation or subsample-multiple imputation estimates of an exposure-outcome association are unbiased.","authors":"Paul Madley-Dowd, Rachael A Hughes, Maya B Mathur, Jon Heron, Kate Tilling","doi":"10.1093/aje/kwaf265","DOIUrl":"10.1093/aje/kwaf265","url":null,"abstract":"<p><p>Missing data are a pervasive problem in epidemiology, with multiple imputation (MI) a commonly used analysis method. MI is valid when data are missing at random (MAR). However, definitions of MAR with multiple incomplete variables are not easily interpretable and descriptions of graphical model-based conditions are not accessible to applied researchers. Previous literature shows that MI may be valid in subsamples, even if not in the full dataset. Practical guidance on applying MI with multiple incomplete variables is lacking. We present an algorithm using directed acyclic graphs to determine when MI will estimate an exposure-outcome coefficient without bias. We extend the algorithm to assess whether MI in a subsample of the data, in which some variables are complete, and the remaining are imputed, will be valid and unbiased for the exposure-outcome coefficient. We apply the algorithm to several simple exemplars, and in a more complex real-life example highlight that only subsample-MI of the outcome would be valid. Our algorithm provides researchers with the tools to decide whether to use MI in practice when there are multiple incomplete variables. 
Further work could focus on the likely size and direction of biases and the impact of different missing data patterns.</p>","PeriodicalId":7472,"journal":{"name":"American journal of epidemiology","volume":" ","pages":"505-514"},"PeriodicalIF":4.8,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145601799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel A Harris, Adam D'Amico, Hemalkumar B Mehta, Lori A Daiello, Sarah D Berry, Charles E Leonard, Yu-Chia Hsu, Douglas Kiel, Kaleen N Hayes, Melissa Riester, Jimmie E Roberts, Laura Reich, Peyton Free, Andrew R Zullo
Nursing home (NH) residents are an important population for pharmacoepidemiologic research due to their prevalence of multimorbidity and polypharmacy. Medicare claims are commonly used to study medication use in this population, but medications dispensed during hospitalizations or post-acute care are unobservable due to bundled payment structures. We developed algorithms to identify NH days when medication dispensings can be observed in claims. Using a cohort of NH residents in the United States from 2013 to 2020, we linked Medicare fee-for-service (FFS) claims with Minimum Data Set clinical assessments. NH days were classified as "observable medication use time" if residents were enrolled in Medicare Parts A, B, and D, were not receiving post-acute care, and were not hospitalized. Among 12.3 million NH residents and 2.7 billion NH days, 1.1 billion days (72.4% of Medicare-enrolled days and 39.6% of all NH days) were identified as observable medication use time. Within the first 100 days of NH admission, 27.3% of days were medication-observable, increasing to 89.4% after 100 days. On average, we identified 68% more person-time and 51% more residents compared with standard 100-day definitions for "long-stay" NH residents. Our algorithms enhance researchers' ability to measure medication exposure time, improving the validity of pharmacoepidemiologic studies.
{"title":"Identifying observable medication use time in administrative databases: a tutorial using nursing home residents.","authors":"Daniel A Harris, Adam D'Amico, Hemalkumar B Mehta, Lori A Daiello, Sarah D Berry, Charles E Leonard, Yu-Chia Hsu, Douglas Kiel, Kaleen N Hayes, Melissa Riester, Jimmie E Roberts, Laura Reich, Peyton Free, Andrew R Zullo","doi":"10.1093/aje/kwaf227","DOIUrl":"10.1093/aje/kwaf227","url":null,"abstract":"<p><p>Nursing home (NH) residents are an important population for pharmacoepidemiologic research due to their prevalence of multimorbidity and polypharmacy. Medicare claims are commonly used to study medication use in this population, but medications dispensed during hospitalizations or post-acute care are unobservable due to bundled payment structures. We developed algorithms to identify NH days when medication dispensings can be observed in claims. Using a cohort of NH residents in the United States from 2013 to 2020, we linked Medicare fee-for-service (FFS) claims with Minimum Data Set clinical assessments. NH days were classified as \"observable medication use time\" if residents were enrolled in Medicare parts A, B, and D were not receiving post-acute care and were not hospitalized. Among 12.3 million NH residents and 2.7 billion NH days, 1.1 billion days (72.4% of Medicare-enrolled days and 39.6% of all NH days) were identified as observable medication use time. Within the first 100 days of NH admission, 27.3% of days were medication-observable, increasing to 89.4% after 100 days. On average, we identified 68% more person-time, and 51% more residents, compared to standard 100-day definitions for \"long-stay\" NH residents. 
Our algorithms enhance researchers' ability to measure medication exposure time, improving the validity of pharmacoepidemiologic studies.</p>","PeriodicalId":7472,"journal":{"name":"American journal of epidemiology","volume":" ","pages":"587-595"},"PeriodicalIF":4.8,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12809372/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145257184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
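The day-level rule stated in the abstract above (enrolled in Medicare Parts A, B, and D, not receiving post-acute care, and not hospitalized) can be sketched as a simple predicate. This is only a minimal illustration of that stated rule, not the authors' algorithm: the `NHDay` record, its field names, and the example days are hypothetical, and deriving these flags from real claims and Minimum Data Set assessments is the hard part that the sketch leaves out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class NHDay:
    """One resident-day in a nursing home (hypothetical record layout)."""
    enrolled_part_a: bool
    enrolled_part_b: bool
    enrolled_part_d: bool
    in_post_acute_care: bool
    hospitalized: bool


def is_observable(day: NHDay) -> bool:
    """True when medication dispensings could appear in claims on this day."""
    enrolled = (
        day.enrolled_part_a and day.enrolled_part_b and day.enrolled_part_d
    )
    return enrolled and not day.in_post_acute_care and not day.hospitalized


def observable_share(days: Iterable[NHDay]) -> float:
    """Fraction of resident-days classified as observable."""
    days = list(days)
    if not days:
        raise ValueError("at least one resident-day is required")
    return sum(is_observable(d) for d in days) / len(days)


# Hypothetical resident-days, for illustration only.
example_days = [
    NHDay(True, True, True, False, False),   # observable
    NHDay(True, True, True, True, False),    # post-acute care: not observable
    NHDay(True, True, False, False, False),  # no Part D: not observable
    NHDay(True, True, True, False, True),    # hospitalized: not observable
]
print(observable_share(example_days))
```

Classifying at the level of days rather than residents is what lets person-time from short or interrupted stays be kept instead of being discarded by a fixed length-of-stay cutoff.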
Lee Smith, Guillermo F López Sánchez, Masoud Rahmati, Pinar Soysal, Mark A Tully, Yvonne Barnett, Laurie Butler, Dong Keon Yon, Soeun Kim, Helen Keyes, Nicola Veronese, Hans Oh, Karel Kostev, Louis Jacob, Jae Il Shin, Ai Koyanagi
We investigated the association between unclean cooking fuel use and sleep problems in a nationally representative sample of adults aged ≥65 years from 6 low- and middle-income countries (China, Ghana, India, Mexico, Russia, and South Africa). Cross-sectional, community-based data from the WHO Study on global AGEing and adult health (SAGE) were analyzed. Unclean cooking fuel referred to kerosene/paraffin, coal/charcoal, wood, agriculture/crop, animal dung, and shrubs/grass. Outcomes related to sleep included self-reported nocturnal sleep problems, lethargy, poor sleep quality, and sleep duration. Multivariable logistic regression analysis was conducted. Data on 14 585 individuals aged ≥65 years were analyzed (mean [SD] age: 72.6 [11.5] years; 55.0% females). After adjustment for potential confounders, unclean cooking fuel use was associated with significantly higher odds of nocturnal sleep problems (odds ratio [OR], 1.51; 95% CI, 1.03-2.22) and of long sleep duration (OR, 1.64; 95% CI, 1.20-2.26; ie, >9 vs >6 to 9 h), but not with other sleep-related outcomes. These findings suggest that the implementation of the United Nations Sustainable Development Goal 7, which advocates affordable, reliable, sustainable, and modern energy for all, may also have a positive impact on sleep problems, as well as a plethora of other health and environmental impacts. This article is part of a Special Collection on Cross-National Gerontology.
{"title":"Unclean cooking fuel use and sleep problems among adults 65 years and older from 6 countries.","authors":"Lee Smith, Guillermo F López Sánchez, Masoud Rahmati, Pinar Soysal, Mark A Tully, Yvonne Barnett, Laurie Butler, Dong Keon Yon, Soeun Kim, Helen Keyes, Nicola Veronese, Hans Oh, Karel Kostev, Louis Jacob, Jae Il Shin, Ai Koyanagi","doi":"10.1093/aje/kwaf022","DOIUrl":"10.1093/aje/kwaf022","url":null,"abstract":"<p><p>We investigated the association between unclean cooking fuel use and sleep problems in a nationally representative sample of adults aged ≥65 years from 6 low- and middle-income countries (China, Ghana, India, Mexico, Russia, and South Africa). Cross-sectional, community-based data from the WHO Study on global AGEing and adult health (SAGE) were analyzed. Unclean cooking fuel referred to kerosene/paraffin, coal/charcoal, wood, agriculture/crop, animal dung, and shrubs/grass. Outcomes related to sleep included self-reported nocturnal sleep problems, lethargy, poor sleep quality, and sleep duration. Multivariable logistic regression analysis was conducted. Data on 14 585 individuals aged ≥65 years were analyzed (mean [SD] age: 72.6 [11.5] years; 55.0% females). After adjustment for potential confounders, unclean cooking fuel use was associated with significant 1.51 (95% CI, 1.03-2.22) times higher odds for nocturnal sleep problems, while it was also associated with 1.64 (95% CI, 1.20-2.26) times higher odds for long sleep duration (ie, >9 vs >6 to 9 h), but not with other sleep-related outcomes. These findings suggest that the implementation of the United Nations Sustainable Development Goal 7, which advocates affordable, reliable, sustainable, and modern energy for all, may also have a positive impact on sleep problems, as well as a plethora of other health and environmental impacts. 
This article is part of a Special Collection on Cross-National Gerontology.</p>","PeriodicalId":7472,"journal":{"name":"American journal of epidemiology","volume":" ","pages":"391-397"},"PeriodicalIF":4.8,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143363204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mia Charifson, Geidily Beaton-Mata, Robyn Lipschultz, India Robinson, Simone A Sasse, Hye-Chun Hur, Shilpi-Mehta S Lee, Erinn M Hade, Linda G Kahn
Electronic health records (EHRs) present opportunities to study uterine fibroids and endometriosis within diverse populations. When using EHR data, it is important to validate outcome classification via diagnosis codes. We performed a validation study of 3 approaches ([1] International Classification of Diseases-10 [ICD-10] code alone, [2] ICD-10 code + diagnostic procedure, and [3] ICD-10 code + all diagnostic information) to identify incident uterine fibroids and endometriosis patients among n = 750 NYU Langone Health patients, 2016-2023. Chart review was used to determine the true diagnosis status. When using a binary classification system (incident vs nonincident patient), Approaches 2 and 3 had higher positive predictive values (PPVs) for uterine fibroids (0.86 and 0.87 vs 0.78) and for endometriosis (0.70 and 0.73 vs 0.66), but Approach 1 outperformed the other 2 in negative predictive values (NPVs) for both outcomes. When using a 3-level classification system (incident vs prevalent vs disease-free patients), PPV for prevalent patients was low for all approaches, while PPV/NPV of disease-free patients was generally above 0.8. Using ICD-10 codes alone yielded higher NPVs but resulted in lower PPVs compared with the other approaches. Continued validation of uterine fibroids/endometriosis EHR studies is warranted to increase research into these understudied gynecologic conditions.
{"title":"Using electronic health record data to identify incident uterine fibroids and endometriosis within a large, urban academic medical center: a validation study.","authors":"Mia Charifson, Geidily Beaton-Mata, Robyn Lipschultz, India Robinson, Simone A Sasse, Hye-Chun Hur, Shilpi-Mehta S Lee, Erinn M Hade, Linda G Kahn","doi":"10.1093/aje/kwaf058","DOIUrl":"10.1093/aje/kwaf058","url":null,"abstract":"<p><p>Electronic health records (EHRs) present opportunities to study uterine fibroids and endometriosis within diverse populations. When using EHR data, it is important to validate outcome classification via diagnosis codes. We performed a validation study of 3 approaches ([1] International Classification of Diseases-10 (ICD-10) code alone, [2] ICD-10 code + diagnostic procedure, and [3] ICD-10 code + all diagnostic information) to identify incident uterine fibroids and endometriosis patients among n = 750 NYU Langone Health 2016-2023. Chart review was used to determine the true diagnosis status. When using a binary classification system (incident vs nonincident patient), Approaches 2 and 3 had higher positive predictive values (PPVs) for uterine fibroids (0.86 and 0.87 vs 0.78) and for endometriosis (0.70 and 0.73 vs 0.66), but Approach 1 outperformed the other 2 in negative predictive values (NPVs) for both outcomes. When using a 3-level classification system (incident vs prevalent vs disease-free patients), PPV for prevalent patients was low for all approaches, while PPV/NPV of disease-free patients was generally above 0.8. Using ICD-10 codes alone yielded higher NPVs but resulted in lower PPVs compared with the other approaches. 
Continued validation of uterine fibroids/endometriosis EHR studies is warranted to increase research into these understudied gynecologic conditions.</p>","PeriodicalId":7472,"journal":{"name":"American journal of epidemiology","volume":" ","pages":"291-299"},"PeriodicalIF":4.8,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
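For readers less familiar with the metrics reported in the validation study above, the following sketch shows how PPV and NPV are computed when chart review is treated as the true diagnosis status. The function name and the counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data.

```python
from typing import Tuple


def predictive_values(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Return (PPV, NPV) from a 2x2 table against a reference standard.

    tp: flagged by the EHR approach and confirmed by chart review
    fp: flagged by the EHR approach but not confirmed
    fn: not flagged but confirmed by chart review
    tn: not flagged and not confirmed
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one flagged and one unflagged patient")
    ppv = tp / (tp + fp)  # share of flagged patients who are true cases
    npv = tn / (tn + fn)  # share of unflagged patients who are truly not cases
    return ppv, npv


# Hypothetical counts, not taken from the study.
ppv, npv = predictive_values(tp=40, fp=10, fn=5, tn=45)
print(ppv, npv)
```

A stricter case definition typically moves patients from the flagged to the unflagged group, which tends to raise PPV while lowering NPV, the same trade-off the abstract describes between the code-only approach and the two stricter ones.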
Amy B Stein, Joshua T B Williams, Laura P Hurley, Kristin Breslin, Kate Kurlandsky, Simon J Hambidge, Jennifer C Nelson, Candace C Fuller, Bradley Crane, Kayla E Hanson, Sungching C Glenn, Amelia Jazwa, Liza M Reifler
During the COVID-19 pandemic, accurate measurement of vaccination status was important for guiding prevention efforts. We assessed the accuracy of electronic health record (EHR) COVID-19 vaccination compared with survey self-reported vaccination status using data from a cross-sectional study among pregnant women and non-pregnant adults in the Vaccine Safety Datalink between 2021 and 2022, where self-report was considered the reference standard. We measured the sensitivity and specificity of EHR vaccine data compared with the self-reported measure and estimated vaccination rates from EHR data. EHR data were obtained initially in November 2021, updated in April 2022, and chart-reviewed in July 2022. Vaccination coverage increased in pregnant/formerly pregnant women and non-pregnant adult respondents by 23.9% and 9.2%, respectively, over 9 months. Estimates of sensitivity based on initial EHR data were 66.0% and 77.3% for pregnant women and non-pregnant people overall, respectively, and between 41% and 66% for pregnant, non-Hispanic Black, and Hispanic, Spanish-speaking respondents. With matured, chart-reviewed EHR data from April 2022, the sensitivity and specificity of EHR vaccine status relative to self-report were >93%. EHR data were a reasonable source of COVID-19 vaccination status during the pandemic and showed high accuracy with self-reported data after allowing EHR data to mature.
{"title":"Accuracy of COVID-19 vaccination self-report compared with data from VSD electronic health records for pregnant women and non-pregnant adults, 2021-2022.","authors":"Amy B Stein, Joshua T B Williams, Laura P Hurley, Kristin Breslin, Kate Kurlandsky, Simon J Hambidge, Jennifer C Nelson, Candace C Fuller, Bradley Crane, Kayla E Hanson, Sungching C Glenn, Amelia Jazwa, Liza M Reifler","doi":"10.1093/aje/kwaf112","DOIUrl":"10.1093/aje/kwaf112","url":null,"abstract":"<p><p>During the COVID-19 pandemic, accurate measurement of vaccination status was important for guiding prevention efforts. We assessed the accuracy of electronic health record (EHR) COVID-19 vaccination compared with survey self-reported vaccination status using data from a cross-sectional study among pregnant women and non-pregnant adults in the Vaccine Safety Datalink between 2021 and 2022, where self-report was considered the reference standard. We measured the sensitivity and specificity of EHR vaccine data compared with the self-reported measure and estimated vaccination rates from EHR data. EHR data were obtained initially in November 2021, updated in April 2022, and record reviewed in July 2022. Vaccination coverage increased in pregnant/formerly pregnant women and non-pregnant adult respondents by 23.9% and 9.2%, respectively, over 9 months. Estimates of sensitivity based on initial EHR data were 66.0% and 77.3% for pregnant women and non-pregnant people overall and between 41% and 66% for pregnant, non-Hispanic Black, and Hispanic, Spanish-speaking respondents. With matured, chart reviewed EHR data from April 2022, the sensitivity and specificity of EHR vaccine status relative to self-report were > 93%. 
EHR data were a reasonable source of COVID-19 vaccination status during the pandemic and showed high accuracy with self-reported data after allowing EHR data to mature.</p>","PeriodicalId":7472,"journal":{"name":"American journal of epidemiology","volume":" ","pages":"515-523"},"PeriodicalIF":4.8,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144186208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
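The accuracy measures in the abstract above treat self-report as the reference standard, so sensitivity is the share of self-reported vaccinated respondents whom the EHR also records as vaccinated, and specificity is the share of self-reported unvaccinated respondents whom the EHR records as unvaccinated. The sketch below shows that calculation on paired records; the function name and the example pairs are hypothetical and are not study data.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(
    pairs: Iterable[Tuple[bool, bool]],
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of EHR status against self-report.

    Each pair is (ehr_vaccinated, self_reported_vaccinated); self-report
    is taken as the reference standard, as in the abstract.
    """
    tp = fn = tn = fp = 0
    for ehr_vaccinated, self_reported in pairs:
        if self_reported:
            if ehr_vaccinated:
                tp += 1
            else:
                fn += 1  # EHR missed a self-reported vaccination
        else:
            if ehr_vaccinated:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need both vaccinated and unvaccinated respondents")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical respondents, for illustration only.
example_pairs = (
    [(True, True)] * 3      # EHR and self-report agree: vaccinated
    + [(False, True)]       # vaccination not yet captured in the EHR
    + [(False, False)] * 2  # EHR and self-report agree: unvaccinated
)
sens, spec = sensitivity_specificity(example_pairs)
print(sens, spec)
```

Vaccinations recorded late in the EHR show up as false negatives in this table, which is one way to read the abstract's finding that sensitivity improved once the EHR data had matured.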
Shekhar Chauhan, Dawn Carr, Miles Taylor, Amanda Sonnega
Widowhood is among the most consequential stressful events for mental health. Although certain resources have been identified as protective of mental health following widowhood, these findings are based on US samples. This study uses novel harmonized data to evaluate differences in depressive symptoms and related factors among those recently widowed (ie, within the last 2 years) in the United States (Health and Retirement Study) and India (Longitudinal Aging Study in India). We find that US widows have a greater elevation in depressive symptoms (-0.36 SD) than widows in India (-0.15) on average. We identify 3 protective resources for widows that are dependent on country context: having close friends vs no friends (-0.58 vs -0.13) and living with others vs alone (-0.79 vs -0.23) both have larger effects for widows in the United States. Self-rated health that is good, fair, or poor is related to higher depressive symptoms for widows in the United States vs India (between 0.55 and 1.12). Findings suggest that protective resources intended to safeguard mental health among recently widowed individuals will require consideration of country context. In particular, social engagement-based interventions may offer more significant benefits to widows in the United States.
{"title":"Differences in protective resources and risks for depressive symptoms among recent widows in the United States and India.","authors":"Shekhar Chauhan, Dawn Carr, Miles Taylor, Amanda Sonnega","doi":"10.1093/aje/kwaf210","DOIUrl":"10.1093/aje/kwaf210","url":null,"abstract":"<p><p>Widowhood is among the most consequential stressful events for mental health. Although certain resources have been identified as protective of mental health following widowhood, these findings are based on US samples. This study uses novel harmonized data to evaluate differences in depressive symptoms and related factors among those recently widowed (ie, within the last 2 years) in the United States (Health and Retirement Study) and India (Longitudinal Aging Study in India). We find US widows have greater elevation in depressive symptoms (-0.36 SD) than widows in India (-0.15) on average. We identify 3 protective resources for widows that are dependent on country context: having close friends vs no friends (-0.58 vs -0.13) and living with others vs alone (-0.79 vs -0.23) are both larger for widows in the United States. Self-rated health that is good, fair, or poor is related to higher depressive symptoms for widows in the United States vs India (between 0.55 and 1.12). Findings suggest protective resources among recently widowed individuals designed to protect mental health following this stressful event will require consideration of country context. 
In particular, social engagement-based interventions may offer more significant benefits to widows in the United States.</p>","PeriodicalId":7472,"journal":{"name":"American journal of epidemiology","volume":" ","pages":"455-463"},"PeriodicalIF":4.8,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145111496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Epidemiologists have access to various methods to reduce bias and improve statistical efficiency in effect estimation, from standard multivariable regression to state-of-the-art doubly robust efficient estimators paired with highly flexible, data-adaptive algorithms ("machine learning"). However, due to numerous assumptions and trade-offs, epidemiologists face practical difficulties in recognizing which method, if any, may be suitable for their specific data and hypotheses. Importantly, relative advantages are necessarily context-specific (data structure, algorithms, model misspecification), limiting the utility of universal guidance. Evaluating performance through real-data-based simulations is useful but out of reach for many epidemiologists. We present a user-friendly, offline Shiny app, REFINE2 (Realistic Evaluations of Finite sample INference using Efficient Estimators), that enables analysts to input their own data and quickly compare the performance of different algorithms within their data context in estimating a prespecified average treatment effect (ATE). REFINE2 automates plasmode simulation of a plausible target ATE given observed covariates and then examines bias and confidence interval coverage (relative to this target) given user-specified models. We present an extensive case study to illustrate how REFINE2 can be used to guide analyses within epidemiologists' own data under three typical scenarios: residual confounding; spurious covariates; and misspecified effect modification. As expected, the apparent best method differed across scenarios, and all were suboptimal under residual confounding. REFINE2 may help epidemiologists not only choose among imperfect models but also better understand common underappreciated problems, such as finite sample bias when using machine learning.
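The plasmode idea described in this abstract can be sketched roughly as follows. This is not the REFINE2 implementation (which the abstract describes only as an offline Shiny app); it is a minimal NumPy illustration under stated assumptions: a linear outcome model fitted to the observed data, a user-imposed target ATE, residual resampling, and two simple estimators (covariate-adjusted and unadjusted regression). The function names `ols` and `plasmode_evaluate` and the toy data are hypothetical.

```python
import numpy as np


def ols(design, y):
    """Ordinary least squares: return coefficients and their standard errors."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = design.shape[0] - design.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(design.T @ design)
    return beta, np.sqrt(np.diag(cov))


def plasmode_evaluate(X, A, Y, target_ate, n_sims=500, seed=0):
    """Plasmode-style check of bias and 95% CI coverage against a known ATE.

    Observed covariates X and treatment A are resampled as-is, so their
    joint structure is preserved; only the outcome is simulated.
    """
    rng = np.random.default_rng(seed)
    n = len(Y)
    # Fit an outcome model on the real data, then overwrite the treatment
    # coefficient so that the true effect in simulated data is known.
    full = np.column_stack([np.ones(n), A, X])
    beta, _ = ols(full, Y)
    resid = Y - full @ beta
    beta_sim = beta.copy()
    beta_sim[1] = target_ate

    designs = {
        "adjusted": lambda a, x: np.column_stack([np.ones(len(a)), a, x]),
        "unadjusted": lambda a, x: np.column_stack([np.ones(len(a)), a]),
    }
    estimates = {name: [] for name in designs}
    covered = {name: [] for name in designs}

    for _ in range(n_sims):
        idx = rng.integers(0, n, n)  # bootstrap rows of (A, X)
        a, x = A[idx], X[idx]
        mean_y = np.column_stack([np.ones(n), a, x]) @ beta_sim
        y = mean_y + rng.choice(resid, n)  # add resampled residual noise
        for name, make_design in designs.items():
            b, se = ols(make_design(a, x), y)
            estimates[name].append(b[1])
            covered[name].append(abs(b[1] - target_ate) <= 1.96 * se[1])

    return {
        name: {
            "bias": float(np.mean(estimates[name]) - target_ate),
            "coverage": float(np.mean(covered[name])),
        }
        for name in designs
    }


# Toy "observed" data in which X[:, 0] confounds treatment and outcome.
toy_rng = np.random.default_rng(1)
n_obs = 400
X = toy_rng.normal(size=(n_obs, 2))
A = (toy_rng.random(n_obs) < 1 / (1 + np.exp(-X[:, 0]))).astype(float)
Y = 1.0 + 2.0 * A + 1.5 * X[:, 0] - 0.5 * X[:, 1] + toy_rng.normal(size=n_obs)

results = plasmode_evaluate(X, A, Y, target_ate=1.0, n_sims=300)
for name, summary in results.items():
    print(name, summary)
```

In this toy setting the unadjusted estimator omits the confounder, which mimics the residual confounding scenario: its bias relative to the imposed target should be clearly nonzero and its interval coverage poor, while the adjusted estimator should sit close to nominal coverage. Swapping in other estimators or a more flexible outcome model would follow the same pattern.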
{"title":"REFINE2: a simplified simulation tool to help epidemiologists evaluate the suitability and sensitivity of effect estimation within user-specified data.","authors":"Xiang Meng, Jonathan Y Huang","doi":"10.1093/aje/kwaf195","DOIUrl":"10.1093/aje/kwaf195","url":null,"abstract":"<p><p>Epidemiologists have access to various methods to reduce bias and improve statistical efficiency in effect estimation, from standard multivariable regression to state-of-the-art doubly robust efficient estimators paired with highly flexible, data-adaptive algorithms (\"machine learning\"). However, due to numerous assumptions and trade-offs, epidemiologists face practical difficulties in recognizing which method, if any, may be suitable for their specific data and hypotheses. Importantly, relative advantages are necessarily context-specific (data structure, algorithms, model misspecification), limiting the utility of universal guidance. Evaluating performance through real-data-based simulations is useful but out of reach for many epidemiologists. We present a user-friendly, offline Shiny app, REFINE2 (Realistic Evaluations of Finite sample INference using Efficient Estimators), that enables analysts to input their own data and quickly compare the performance of different algorithms within their data context in estimating a prespecified average treatment effect (ATE). REFINE2 automates plasmode simulation of a plausible target ATE given observed covariates and then examines bias and confidence interval coverage (relative to this target) given user-specified models. We present an extensive case study to illustrate how REFINE2 can be used to guide analyses within epidemiologists' own data under three typical scenarios: residual confounding; spurious covariates; and misspecified effect modification. As expected, the apparent best method differed across scenarios, and all were suboptimal under residual confounding. 
REFINE2 may help epidemiologists not only choose among imperfect models but also better understand common underappreciated problems, such as finite sample bias when using machine learning.</p>","PeriodicalId":7472,"journal":{"name":"American journal of epidemiology","volume":" ","pages":"533-542"},"PeriodicalIF":4.8,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144938769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}