Pub Date: 2026-01-31; DOI: 10.1016/j.annepidem.2026.01.016
Angela D Liese, Brian E Dixon, Tessa Crume, Jasmin Divers, Yi Guo, Annemarie G Hirsch, Kristi Reynolds, Levon Utidjian, Ibrahim Zaganjor, Marc Rosenman
Purpose: A critical function of public health is to monitor diseases that impede quality of life and burden affected communities. The Diabetes in Children, Adolescents and Young Adults (DiCAYA) Network aims to advance disease monitoring for diabetes using multi-site electronic health record (EHR) data.
Methods: This work involved validating and refining case definitions for accurate identification of type 1 and type 2 diabetes cases to estimate incidence and prevalence of diabetes in children, adolescents, and young adults through age 44 years.
Results: In this essay, we describe the challenges experienced by the Network and lessons learned. Challenges included accessing EHR data, harmonizing EHR data from heterogeneous health systems to a common data model, and developing methods to account for bias introduced by the non-representativeness of health care utilization data. Lessons learned included approaches for data quality assessment, bias correction, and scalability.
Conclusions: As the US continues to evolve its public health data systems and its approach to chronic disease monitoring, the DiCAYA Network offers guidance on factors for success as well as pitfalls to avoid.
Title: Public Health Monitoring of Diabetes in the Era of Electronic Health Records: Insights from the Diabetes in Children, Adolescents and Young Adults (DiCAYA) Network. Journal: Annals of Epidemiology.
Pub Date: 2026-01-31; DOI: 10.1016/j.annepidem.2026.01.017
Greta Jianjia Cheng, Christina F Mair, Jeanine M Buchanich, Tiffany L Gary-Webb, C Elizabeth Shaaban, Andrea L Rosso
Purpose: Evidence regarding neighborhood socioeconomic status (nSES) as an upstream determinant of cognitive outcomes has largely lacked a life-course perspective. We examined racial differences in the associations between midlife and late-life nSES and cognitive function in a cohort of 330 Black and White older Americans aged 70+.
Methods: General cognitive function was measured using the Modified Mini-Mental State Examination over up to 15 years of follow-up. Midlife (age 49-58) and late-life (age 70-79) nSES scores were z-standardized based on five census indicators of tract-level socioeconomic characteristics. Mixed-effects linear regression examined the associations between midlife and late-life nSES and cognitive function.
Results: Higher midlife nSES was associated with higher baseline levels of cognitive function among Black (β: 3.10, 95% CI: 0.85, 5.33), but not among White participants (β: 0.51, 95% CI: -0.88, 1.90; p for interaction: 0.037). There were no observed associations between midlife nSES and changes in cognitive function in the overall sample or in either racial group. Late-life nSES was not associated with baseline levels of cognitive function or changes in the overall sample or either racial group.
Conclusions: Midlife may be a critical period in which neighborhood socioeconomic exposure has a greater impact on late-life cognitive health, particularly for Black individuals.
Title: Midlife and Late-life Neighborhood Socioeconomic Status and Cognitive Function in Later life: Differences by race. Journal: Annals of Epidemiology.
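The z-standardization of nSES described in the Methods above can be illustrated with a minimal sketch. The abstract does not say how the five census indicators were combined, so averaging the per-indicator z-scores is an assumption here, and the indicator names and values (`median_income`, `pct_college`) are hypothetical.

```python
from statistics import mean, pstdev


def z_standardize(values):
    """Return z-scores (mean 0, SD 1) for one indicator across tracts."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("indicator has no variation across tracts")
    return [(v - mu) / sd for v in values]


def composite_nses(indicators):
    """Average the per-indicator z-scores for each tract.

    `indicators` maps an indicator name to a list of tract-level values,
    with every list in the same tract order. Indicators for which a higher
    value means lower SES should be sign-flipped before calling this.
    Averaging is an assumption; the study may have combined them differently.
    """
    z_by_indicator = [z_standardize(vals) for vals in indicators.values()]
    n_tracts = len(z_by_indicator[0])
    return [mean(z[i] for z in z_by_indicator) for i in range(n_tracts)]


# Hypothetical data: three tracts, two made-up indicators.
tracts = {
    "median_income": [30000, 50000, 70000],
    "pct_college": [10.0, 20.0, 30.0],
}
scores = composite_nses(tracts)
print(scores)
```

Because each indicator is centered before averaging, the composite scores sum to zero across tracts, and a tract at the mean on every indicator scores exactly zero.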
Pub Date: 2026-01-30; DOI: 10.1016/j.annepidem.2026.01.011
Yu He, Chanapong Rojanaworarit
Purpose: To compare seven machine learning (ML) models developed to predict non-response to the sexual identity question in the 2023 Youth Risk Behavior Surveillance System (YRBSS) and identify the best-performing ML model, along with key attributes associated with the non-response.
Methods: Data from 20,103 students, with 32 predictors across the domains of personal characteristics, school behavior, substance use, and sexual activity, were analyzed. Seven supervised ML models, namely random forest (RF), gradient boosting, extreme gradient boosting, decision tree, neural network, lasso, and elastic net, were developed with survey weights incorporated. Performance was assessed using F1 score, area under the ROC curve (AUC), and area under the precision-recall curve (AUPRC).
Results: About 10% of students did not respond to the sexual identity question, with higher rates among racial/ethnic minorities, including American Indian/Alaska Native and Native Hawaiian/Pacific Islander youths. The RF model showed the most robust overall performance across all metrics. Attributes predicting non-response included response status to the questions on school absence due to safety concerns and on having ≥4 sexual partners.
Conclusions: Non-response was non-random and concentrated among vulnerable groups. Predictive performance was strong, but findings suggest that response patterns to other sensitive survey items play a substantial role, with implications for survey design and non-response adjustment.
Title: Predicting Nonresponse to Sexual Identity Question in Youth Risk Behavior Surveillance: A Machine Learning Analysis of Complex Survey Data. Journal: Annals of Epidemiology.
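As a rough illustration of how an F1 score can take survey weights into account, the sketch below accumulates a weighted confusion matrix. The abstract states that the models incorporated survey weights but does not say whether the metrics were weighted as well, so this is an assumption; the function names and the toy labels, predictions, and weights are hypothetical.

```python
def weighted_confusion(y_true, y_pred, weights):
    """Accumulate survey weights into the four confusion-matrix cells."""
    tp = fp = fn = tn = 0.0
    for t, p, w in zip(y_true, y_pred, weights):
        if t == 1 and p == 1:
            tp += w
        elif t == 0 and p == 1:
            fp += w
        elif t == 1 and p == 0:
            fn += w
        else:
            tn += w
    return tp, fp, fn, tn


def weighted_f1(y_true, y_pred, weights):
    """F1 = harmonic mean of weighted precision and weighted recall."""
    tp, fp, fn, _ = weighted_confusion(y_true, y_pred, weights)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = non-response, 0 = response.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
survey_w = [2.0, 1.0, 1.0, 1.0, 1.0]

f1_weighted = weighted_f1(y_true, y_pred, survey_w)
f1_unweighted = weighted_f1(y_true, y_pred, [1.0] * len(y_true))
print(f1_weighted, f1_unweighted)
```

In this toy example the correctly classified non-responder with weight 2.0 pulls the weighted F1 above the unweighted value, which shows why the choice to weight or not weight a metric should be reported.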
Pub Date: 2026-01-29; DOI: 10.1016/j.annepidem.2026.01.014
Afroza Parvin, Rebecca D Kehm, Baozhen Qiao, James E Cone, Mark R Farfel, Rachel Zeig-Owens, David G Goldfarb, Moshe Z Shapiro, Andrew C Todd, Tabassum Insaf, Charles B Hall, Paolo Boffetta, Jiehui Li
Purpose: The World Trade Center Health Program (WTCHP) plays a critical role in providing medical monitoring and treatment to those exposed to the terrorist attacks of September 11, 2001 (9/11). We investigated the association of WTCHP membership with mortality risk among 9/11 responders while controlling for comorbidities using inverse probability weighting.
Methods: We prospectively analyzed 28,430 9/11 responders, followed from the time of their enrollment into the WTCHP or the WTC Health Registry through 2020. National Death Index (NDI) linkage provided death data. Non-cancer comorbidities were self-reported physician diagnoses, and cancer was identified through cancer registry linkage. We estimated the adjusted hazard ratio (aHR) with 95 % confidence interval (CI) for the association between WTCHP membership and all-cause and cause-specific mortality using Cox proportional hazards models and cause-specific hazard regression models, respectively.
Results: A total of 1657 deaths were identified over 444,425 person-years of follow-up. Compared to non-members, WTCHP members had a lower risk of all-cause mortality (aHR=0.87; 95 % CI=0.77-0.98) and smoking-related mortality (aHR=0.83; 0.69-0.99) after adjusting for demographics, WTC exposure, and weights of comorbidities. With the membership-sex interaction included, reduced risk of all-cause mortality remained statistically significant among males only (aHR=0.85; 0.75-0.96). Cancer- and heart-related mortality risks were not significantly different between WTCHP members and non-members.
Conclusions: This study found that WTCHP membership may reduce risks of all-cause and smoking-related mortality among 9/11 responders, even after accounting for underlying medical conditions, underscoring the importance of comprehensive health monitoring and treatment services for disaster-relief workers.
Title: Effect of World Trade Center Health Program on mortality among 9/11 responders. Journal: Annals of Epidemiology, pages 8-14.
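The idea behind inverse probability weighting, as mentioned in the Purpose above, can be sketched as follows. This is a simplified illustration, not the study's model: it assumes a single binary comorbidity and estimates the probability of membership as the observed share of members within each comorbidity stratum. The function name `ipw_weights` and the toy data are hypothetical.

```python
from collections import defaultdict


def ipw_weights(treated, stratum):
    """Unstabilized inverse probability weights.

    The propensity score is estimated nonparametrically as the share of
    treated (here: program member) subjects in each covariate stratum.
    """
    n = defaultdict(int)
    n_treated = defaultdict(int)
    for a, s in zip(treated, stratum):
        n[s] += 1
        n_treated[s] += a
    weights = []
    for a, s in zip(treated, stratum):
        ps = n_treated[s] / n[s]
        if ps in (0.0, 1.0):
            raise ValueError("positivity violated in stratum %r" % (s,))
        weights.append(1 / ps if a == 1 else 1 / (1 - ps))
    return weights


# Hypothetical toy cohort: membership is more common among the comorbid.
member = [1, 1, 1, 0, 1, 0, 0, 0]
comorbid = [1, 1, 1, 1, 0, 0, 0, 0]
w = ipw_weights(member, comorbid)


def weighted_prevalence(group):
    """Weighted share of comorbid subjects within one membership group."""
    num = sum(wi * c for wi, a, c in zip(w, member, comorbid) if a == group)
    den = sum(wi for wi, a in zip(w, member) if a == group)
    return num / den


print(weighted_prevalence(1), weighted_prevalence(0))
```

Before weighting, 75 % of members and 25 % of non-members in the toy cohort are comorbid; after weighting, both groups have the same comorbidity prevalence, which is the balance that weighting is meant to achieve before fitting an outcome model.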
Pub Date: 2026-01-29; DOI: 10.1016/j.annepidem.2026.01.013
Romain Brisson
Purpose: This study examined how careless and inconsistent reporting affects adolescent suicidality prevalence and sex differences, a methodological issue often overlooked in self-report epidemiological research.
Methods: I used data from two nationally representative surveys of secondary-school students conducted in 2010 (n = 7640; 49.3 % female) and 2014 (n = 5592; 52.6 % female). Both surveys assessed depressive symptoms, suicidal ideation, suicide plans, suicide attempts, attempt recognition, and attempt disclosure. Three methods of prevalence computation were used: unadjusted estimates (M1); excluding fictitious drug endorsers and treating inconsistencies as missing (M2); and excluding all careless and inconsistent reporters (M3).
Results: About 19 % of respondents were identified as careless or inconsistent. Compared to M1, M2 and M3 yielded lower prevalence estimates for most indicators. The largest reductions involved, on average, reports of unnoticed suicide attempts (-73.8 %), talking to no one about an attempt (-73.3 %), and reporting six or more suicide attempts (-35.9 %). Most sex differences were unaffected, except for the 'six or more suicide attempts' category and attempt recognition and disclosure items.
Conclusions: Overlooking misreporting may inflate adolescent suicidality prevalence and distort sex-difference estimates. Incorporating validity checks and data-cleaning procedures can improve the accuracy of epidemiological findings and the effectiveness of prevention programs.
Title: Careless and inconsistent reporting inflates suicidality prevalence and biases sex differences. Journal: Annals of Epidemiology, pages 23-27.
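The three prevalence computations (M1 to M3) described in the Methods above can be sketched roughly as below. The record layout is hypothetical, and the reading that "careless" means endorsing a fictitious drug and that M2 drops only the inconsistent answer for the item being estimated is an assumption based on the abstract's wording.

```python
def prevalence(respondents, item, method="M1"):
    """Prevalence of `item` under one of three data-cleaning rules.

    M1: unadjusted, every respondent counts.
    M2: drop fictitious-drug endorsers; treat an inconsistent answer on
        this item as missing (the respondent leaves this denominator only).
    M3: drop every careless or inconsistent respondent entirely.
    """
    if method not in ("M1", "M2", "M3"):
        raise ValueError("unknown method: %s" % method)
    num = den = 0
    for r in respondents:
        careless = r["endorsed_fictitious_drug"]
        inconsistent_here = item in r["inconsistent_items"]
        inconsistent_any = bool(r["inconsistent_items"])
        if method == "M2" and (careless or inconsistent_here):
            continue
        if method == "M3" and (careless or inconsistent_any):
            continue
        den += 1
        num += r[item]
    return num / den if den else float("nan")


def make(attempt, fictitious=False, inconsistent=()):
    """Build one hypothetical respondent record."""
    return {
        "attempt": attempt,
        "endorsed_fictitious_drug": fictitious,
        "inconsistent_items": set(inconsistent),
    }


# Hypothetical sample of six respondents.
sample = [
    make(0),
    make(0),
    make(1),
    make(1, fictitious=True),
    make(1, inconsistent=["attempt"]),
    make(0, inconsistent=["ideation"]),
]

m1 = prevalence(sample, "attempt", "M1")
m2 = prevalence(sample, "attempt", "M2")
m3 = prevalence(sample, "attempt", "M3")
print(m1, m2, m3)
```

In this toy sample the unadjusted estimate is the highest of the three because two of the three reported attempts come from flagged respondents, mirroring the direction of the reductions reported in the Results.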
Pub Date: 2026-01-23; DOI: 10.1016/j.annepidem.2026.01.007
Longjian Liu, Jintong Hou
Purpose: We aimed to identify key midlife dementia predictors and develop a novel machine learning (ML)-enabled risk prediction model.
Methods: We used data from 9266 Atherosclerosis Risk in Communities study participants (aged 45-64 years at baseline, 1987-1989). Incident dementia was ascertained through December 2019. An ML-based LASSO-Cox model was applied to develop the risk prediction model.
Results: Over a 25-year mean follow-up, 2010 participants developed dementia. The LASSO-Cox model identified 12 key predictors and achieved C-indices (95 % CI) of 0.77 (0.75-0.79) in the training set (n = 6182) and 0.78 (0.76-0.81) in the test set (n = 3084). Predictors included age, Digit Symbol Substitution Test, apolipoprotein E ε4, HbA1c, brachial blood pressure, Factor VIII, Delayed Word Recall Test, hypertension, stroke history, C-reactive protein, white blood cell count, and apolipoprotein B. The resulting nomogram demonstrated strong discrimination (AUC 0.77-0.86) and good calibration. LASSO-Cox risk score quartiles effectively stratified participants into low, moderate, high, and very high dementia risk groups.
Conclusions: The findings demonstrate that the newly developed machine learning-based LASSO-Cox model provides a robust method to identify individuals at high risk of dementia.
Title: Machine learning-based LASSO-Cox model for dementia prediction: The role of midlife cardiometabolic, inflammatory, and genetic risk factors in a US cohort. Journal: Annals of Epidemiology, pages 28-36.
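The C-indices reported in the Results above measure how often the model ranks the subject who develops dementia earlier as higher risk. A minimal sketch of Harrell's concordance index for right-censored data follows; the abstract does not state which estimator was used, so treating it as Harrell's C is an assumption, and the follow-up times, event flags, and risk scores are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when subject i had an observed event and
    subject j was still under follow-up after that time. The pair is
    concordant when the earlier-event subject has the higher risk score;
    tied risk scores count as half. Pairs with tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: years of follow-up, event flag (1 = dementia),
# and a model risk score where higher means higher predicted risk.
years = [2, 4, 6, 8]
dementia = [1, 1, 0, 1]
risk = [0.9, 0.3, 0.5, 0.1]

c = harrell_c_index(years, dementia, risk)
print(c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.77 to 0.78 sits between the two.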
Pub Date: 2026-01-17; DOI: 10.1016/j.annepidem.2026.01.008
Angela D'Adamo, Amii M Kress, Rima Habre, Nissa Towe-Goodman, Michael R Desjardins, Akram Alshawabkeh, Izzuddin M Aris, Carlos A Camargo, Kecia N Carroll, Andrea E Cassidy-Bushrow, Su H Chu, Yolaine Civil, Alexandrea L Craft, Lisa A Croen, Sean Deoni, Viren Dsa, Anne L Dunlop, Amy J Elliott, Assiamira Ferrara, Jody M Ganiban, Akhgar Ghassabian, Tina Hartert, Delma-Jean Watts, Margaret R Karagas, Catherine J Karr, Daphne Koinis-Mitchell, Michael Kramer, Cindy T McEvoy, Hooman Mirzakhani, Thomas G O'Connor, Wei Perng, Rebecca J Schmidt, Uzma Shah, Irene Tung, Rosalind J Wright, Emily A Knapp
Purpose: To examine factors associated with moving during pregnancy and impacts of assigning nSES at enrollment, delivery, or a time-weighted average on birth outcomes (birthweight, birthweight-for-gestational-age z-score, low birthweight, gestational age, small-for-gestational age, preterm birth).
Methods: We used data from the Environmental influences on Child Health Outcomes (ECHO) Cohort Study (2010-2019) with nSES data from the American Community Survey (ACS) matched by time and location to monthly residential histories. We used multivariable logistic models with Generalized Estimating Equations to identify factors associated with moving and quantify exposure misclassification in model estimates.
Results: Approximately 7 % of 15,376 participants moved at least once during pregnancy. Maternal age (OR: 0.97, 95 % CI: 0.95, 0.98) and other race vs. White (OR: 0.39, 95 % CI: 0.20, 0.80) were associated with lower odds of moving; lower neighborhood-level education (OR: 1.34, 95 % CI: 1.11, 1.62) and living in urban neighborhoods (OR: 3.03, 95 % CI: 1.39, 6.59) were associated with higher odds. Among movers, estimated associations between nSES and birth outcomes changed by ≥ 16 % depending on the address assignment; the association with birthweight-for-gestational-age z-score was significant only when using nSES at delivery.
Conclusion: Sociodemographic and nSES characteristics are associated with moving during pregnancy; movers may experience exposure misclassification and underestimated effects on birth outcomes.
Title: Residential mobility during pregnancy and birth outcomes in the United States: The environmental influences on Child Health Outcomes (ECHO) Cohort (2010-2019). Journal: Annals of Epidemiology, pages 15-22.
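The three ways of assigning nSES compared above (at enrollment, at delivery, or as a time-weighted average) can be illustrated with a small sketch. It assumes a residential history recorded as months spent at each address together with that address's nSES score; the function name and the example values are hypothetical.

```python
def time_weighted_average(residences):
    """Time-weighted mean exposure over a residential history.

    `residences` is a list of (months_at_address, nses_score) tuples in
    chronological order, from enrollment to delivery.
    """
    total_months = sum(months for months, _ in residences)
    if total_months <= 0:
        raise ValueError("residential history covers no time")
    return sum(months * score for months, score in residences) / total_months


# Hypothetical mover: 3 months in a lower-nSES tract, then 6 months in a
# higher-nSES tract until delivery.
history = [(3, -0.5), (6, 1.0)]

nses_enrollment = history[0][1]
nses_delivery = history[-1][1]
nses_twa = time_weighted_average(history)
print(nses_enrollment, nses_delivery, nses_twa)
```

For a participant who never moves, all three assignments coincide; for a mover they can differ substantially, which is the source of the exposure misclassification discussed in the Conclusion.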
Pub Date: 2026-01-13; DOI: 10.1016/j.annepidem.2026.01.004
Peter M. Socha, Maryam Oskoui, Jennifer A. Hutcheon, Sam Harper
Purpose
To improve the identification of cerebral palsy cases in administrative health data.
Methods
We included all children in a population-based cerebral palsy registry in Quebec, Canada, born from 1999 through 2002, and a sample of children without cerebral palsy. Population-based hospitalization and physician billing records through 2012 were obtained for all children. We used logistic regression to model the probability of cerebral palsy, using International Classification of Diseases codes for related diseases. We reported receiver operating characteristic (ROC) and precision-recall (PR) curves, and compared the accuracy to that of existing algorithms. We also reported the accuracy of cerebral palsy codes by age, data source, and gestational age at birth.
Results
The areas under the ROC and PR curves of our model were 0.98 (95 % CI: 0.97–0.99) and 0.73 (95 % CI: 0.63–0.79), respectively. Cut-offs with a similar specificity to existing algorithms yielded sensitivities that were 1–14 percentage points higher. The sensitivity of cerebral palsy codes was higher (and the specificity was lower) with longer follow-up times since birth, when using both hospitalization and billing records, and among children born preterm.
Conclusions
Our model improved identification of cerebral palsy cases in administrative data, but residual misclassification remained.
Title: A multivariable model for improving the identification of cerebral palsy cases in administrative health data. Journal: Annals of Epidemiology, volume 114, pages 26-31.
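The trade-off between sensitivity and specificity when choosing a cut-off on a model's predicted probability, as discussed in the Results above, can be sketched as follows. The predicted probabilities, case labels, and cut-off values are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, probs, cutoff):
    """Classify as a case when probability >= cutoff; return (sens, spec)."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, probs):
        predicted_case = p >= cutoff
        if y == 1 and predicted_case:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_case:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: 1 = registry-confirmed case, 0 = non-case, with
# made-up predicted probabilities from a logistic model.
is_case = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [0.9, 0.7, 0.2, 0.6, 0.3, 0.1, 0.1, 0.05]

low_cut = sensitivity_specificity(is_case, predicted, 0.15)
mid_cut = sensitivity_specificity(is_case, predicted, 0.5)
high_cut = sensitivity_specificity(is_case, predicted, 0.65)
print(low_cut, mid_cut, high_cut)
```

Lowering the cut-off catches more true cases at the cost of more false positives, so comparing algorithms at a matched specificity, as the abstract does, keeps the sensitivity comparison fair.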