
JAMIA Open: latest articles

A preliminary evaluation of AFibrisk: digital decision-support platform for atrial fibrillation risk assessment after cryptogenic stroke - a cross-sectional concordance study.
IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2026-01-11 eCollection Date: 2026-02-01 DOI: 10.1093/jamiaopen/ooag001
João Brainer Clares de Andrade, Rafael P Gomes, Alexandre Cristiuma Robles, Thales Pardini Fagundes, George N Nunes Mendes

Objectives: Identifying patients at high risk for atrial fibrillation (AF) after cryptogenic stroke remains a challenge, particularly in settings with limited access to long-term cardiac monitoring. The AFibrisk platform, a free digital decision-support tool, integrates 19 validated AF prediction scores to support post-stroke triage. We aimed to assess the concordance of AFibrisk-supported classification decisions with expert electrophysiologist consensus and compare performance across evaluator groups with different levels of clinical experience.

Materials and methods: A prospective, cross-sectional concordance study was conducted using 29 standardized clinical vignettes. Evaluators (3 vascular neurologists, 4 cardiology residents, and 11 neurology residents) classified each case as high or low AF risk using AFibrisk outputs. Expert consensus served as the reference standard. Statistical analyses included inter-group comparisons, inter-rater reliability, and regression models adjusting for group size and response clustering.

Results: Vascular neurologists demonstrated the highest agreement with the reference standard (mean 90.3%), followed by cardiology residents (85.2%) and neurology residents (77.5%). Differences were statistically significant (ANOVA p = .0199; Kruskal-Wallis p = .0259). Neurology residents showed the greatest intra-group consistency (Light's κ = 0.607), despite lower accuracy. Classification errors differed by experience: residents tended to overestimate risk, while experts showed occasional underestimation. Overall, 30.1% of responses were "not classified," with the highest uncertainty among vascular neurologists (43.8%).

Discussion and conclusion: AFibrisk improved alignment with expert judgment across evaluator groups and helped standardize decision-making. Our free platform may support AF risk stratification in low-resource environments and reinforce evidence-based heuristics among early-career clinicians, and it is available at www.afibrisk.net.
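
The two agreement measures reported in this abstract, percent agreement with the reference standard and Light's κ within a group, can be illustrated with a minimal sketch. Light's κ is the mean of Cohen's κ over all pairs of raters. The ratings below are hypothetical and are not the study's data; the sketch also assumes every case receives a "high" or "low" label, so it does not model the "not classified" responses mentioned in the results.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters who labelled the same cases."""
    n = len(a)
    labels = set(a) | set(b)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


def lights_kappa(ratings):
    """Light's kappa: the mean of Cohen's kappa over all rater pairs."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def percent_agreement(rater, reference):
    """Share of cases (in percent) where a rater matches the reference."""
    hits = sum(x == y for x, y in zip(rater, reference))
    return 100 * hits / len(reference)


# Hypothetical ratings for six vignettes (not taken from the study).
reference = ["high", "high", "low", "low", "high", "low"]
raters = [
    ["high", "high", "low", "low", "high", "low"],
    ["high", "low", "low", "low", "high", "low"],
    ["high", "high", "low", "high", "high", "low"],
]

for i, r in enumerate(raters):
    print(f"rater {i}: {percent_agreement(r, reference):.1f}% agreement")
print(f"Light's kappa: {lights_kappa(raters):.3f}")
```

The sketch also shows why a group can be internally consistent yet less accurate: κ compares raters with each other, while percent agreement compares each rater with the reference.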

JAMIA Open 9(1): ooag001. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12924629/pdf/
Citations: 0
Enhancing cause of death prediction: development and validation of machine learning models using multimodal data across multiple health-care sites.
IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2026-01-08 eCollection Date: 2026-02-01 DOI: 10.1093/jamiaopen/ooaf175
Mohammed Al-Garadi, Rishi J Desai, Kerry Ngan, Michele LeNoue-Newton, Ruth M Reeves, Daniel Park, Jose J Hernández-Muñoz, Shirley V Wang, Judith C Maro, Candace C Fuller, Joshua Lin Kueiyu, Aida Kuzucan, Kevin Coughlin, Haritha Pillai, Melissa McPheeters, Jill Whitaker, Jessica A Buckner, Michael F McLemore, Dax M Westerman, Michael E Matheny

Objectives: To develop and validate machine learning (ML) models that predict probable cause of death (CoD) using structured electronic health record (EHR) data, unstructured clinical notes, and publicly available sources.

Materials and methods: This multi-institutional retrospective study was conducted across Vanderbilt University Medical Center (VUMC) and Massachusetts General Brigham (MGB), including deceased patients with encounters between October 1, 2015, and January 1, 2021, and confirmed death records. The cohort included 13 708 patients from VUMC and 34 839 from MGB. The primary outcome was underlying CoD categorized into the top 15 National Center for Health Statistics rankable causes, with others grouped as "Other." Performance was assessed using weighted area under the receiver operating characteristic curve (AUC) and F-measure.

Results: The XGBoost model using structured EHR data alone achieved weighted AUCs of 0.86 (95% CI, 0.84-0.88) at VUMC and 0.80 (95% CI, 0.79-0.80) at MGB. Adding unstructured notes improved performance, with weighted AUCs of 0.90 (95% CI, 0.88-0.93) at VUMC and 0.92 (95% CI, 0.91-0.92) at MGB. Adding publicly available data did not further improve performance. Cross-institutional validation revealed significant performance degradation.

Discussion: Models integrating structured and unstructured EHR data show strong within-institution performance but limited generalizability across healthcare systems, highlighting challenges related to institutional data heterogeneity.

Conclusions: Machine learning models combining structured and unstructured EHR data accurately predict CoD within institutions but perform poorly across sites. Health-care institutions may benefit from adopting robust processes for locally tailored models, and future research should focus on enhancing model generalizability while addressing unique institutional data environments.
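
The abstract does not spell out how the weighted AUC was computed. A common convention for multi-class outcomes, assumed here, is a one-vs-rest AUC per class averaged with weights equal to each class's prevalence. The sketch below implements that convention in plain Python; the class names, labels, and probabilities are hypothetical.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def weighted_ovr_auc(y_true, proba, classes):
    """Prevalence-weighted mean of one-vs-rest AUCs.

    proba[i][k] is the predicted probability of classes[k] for sample i.
    Assumes every class occurs at least once in y_true, but not in all samples.
    """
    n = len(y_true)
    total = 0.0
    for k, cls in enumerate(classes):
        labels = [1 if y == cls else 0 for y in y_true]
        scores = [row[k] for row in proba]
        total += (sum(labels) / n) * binary_auc(labels, scores)
    return total


# Hypothetical example with three cause-of-death categories.
classes = ["heart disease", "cancer", "Other"]
y_true = ["heart disease", "heart disease", "cancer", "cancer", "Other", "Other"]
proba = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.4, 0.3],
    [0.1, 0.2, 0.7],
    [0.5, 0.1, 0.4],
]

print(round(weighted_ovr_auc(y_true, proba, classes), 4))
```

Weighting by prevalence means frequent causes dominate the summary, so a high weighted AUC can coexist with weak discrimination for rare categories.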

JAMIA Open 9(1): ooaf175. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12924636/pdf/
Citations: 0
Inferring high-fat dietary patterns from electronic health record data using machine learning.
IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2026-01-07 eCollection Date: 2026-02-01 DOI: 10.1093/jamiaopen/ooaf181
Ya-Yun Yeh, Hsin-Yueh Lin, Jingchuan Guo, Ramon C Sun, Sizun Jiang, Jiang Bian, Hao Dai

Objectives: Electronic health records (EHRs) rarely capture dietary detail, limiting diet-disease research. We aimed to develop machine learning (ML) computable phenotypes to identify high-fat diet (HFD) using variables typically available in EHRs.

Materials and methods: We used National Health and Nutrition Examination Survey (NHANES) 1999-2020 data, where 24-h dietary recall served as ground truth. Dietary fat intake was summarized into a score (0-30) based on percent energy from fat, carbohydrate, and protein; lower scores indicated HFD. We defined HFD at cutoffs of 10, 15, and 20, and trained ML models (Extreme Gradient Boosting, logistic regression, random forest) using EHR-compatible variables (demographics, comorbidities, labs, anthropometrics). Model interpretability was assessed using Shapley Additive Explanations. To evaluate clinical relevance, we compared cancer associations using ML-predicted vs true diet labels.

Results: Machine learning models classified HFD with good performance, strongest at broader definitions. Random forest achieved an F1-score of 0.79 (recall 0.74, precision 0.84) at cutoff 20. Key predictors included race/ethnicity, triglycerides, obesity metrics (body mass index and derived indices), and metabolic panel results.

Discussion: These findings indicate that dietary patterns, though seldom recorded in EHRs, can be inferred from routinely available variables. The ability of ML-derived phenotypes to reproduce known diet-disease relationships underscores their epidemiologic validity. Top predictors also align with established biological pathways linking obesity, lipid metabolism, and cancer risk, supporting plausibility.

Conclusion: A high-fat dietary pattern can be inferred from EHR-compatible variables using ML-based phenotyping. This approach offers a scalable tool to integrate diet into EHR-based research and precision medicine.
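
As a quick consistency check on the reported random forest result, the F1-score is the harmonic mean of precision and recall. The sketch below uses only the precision and recall stated in the abstract; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values reported for random forest at cutoff 20.
f1 = f1_score(precision=0.84, recall=0.74)
print(round(f1, 2))  # 0.79
```

The computed value matches the reported F1-score of 0.79.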

JAMIA Open 9(1): ooaf181. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12794014/pdf/
Citations: 0
Development of a risk factor framework to inform machine learning prediction of young people's mental health problems: a Delphi study.
IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-12-23 eCollection Date: 2025-12-01 DOI: 10.1093/jamiaopen/ooaf166
Katherine Parkin, Ryan Crowley, Rachel Sippy, Shabina Hayat, Yi Zhang, Emily Brewis, Nicole Marshall, Tara Ramsay-Patel, Vahgisha Thirugnanasampanthan, Guy Skinner, Peter Fonagy, Carol Brayne, Anna Moore

Objectives: To create a theoretical framework of mental health risk factors to inform the development of prediction models for young people's mental health problems.

Materials and methods: We created an initial prototype theoretical framework using a rapid literature search and stakeholder discussion. A snowball sampling approach identified experts for the Delphi study. Round 1 sought consensus on the overall approach, framework domains, and life course stages. Round 2 aimed to establish the points in the life course where exposure to specific risk factors would be most influential. Round 3 ranked risk factors within domains by their predictive importance for young people's mental health problems.

Results: The final framework reached consensus after 3 rounds and included 287 risk factors across 8 domains and 5 life course stages. Twenty-five experts completed round 3. Domains ranked as most important were "Social and Environmental" and "Psychological and Mental Health." Ranked lists of risk factors within domains and heat maps showing the salience of risk factors across life course stages were generated.

Discussion: The study integrated multidisciplinary expert perspectives and prioritized health equity throughout the framework's development. The ranked risk factor lists and life stage heat maps support the targeted inclusion of risk factors across developmental stages in prediction models.

Conclusion: This theoretical framework provides a roadmap of important risk factors for inclusion in early identification models to enhance the predictive accuracy of childhood mental health problems. It offers a useful theoretical reference point to support model building for those without domain expertise.

JAMIA Open 8(6): ooaf166. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12726920/pdf/
Citations: 0
Higher electronic health record burden among women physicians in academic ambulatory medicine.
IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-12-17 eCollection Date: 2025-12-01 DOI: 10.1093/jamiaopen/ooaf164
Sarah Y Bessen, Sean Tackett, Kimberly S Peairs, Lisa Christopher-Stine, Charles M Stewart, Lee D Biddison, Maria Oliva-Hemker, Jennifer K Lee

Objectives: Electronic health record (EHR) work may differently affect women and men physicians. Identifying gender discrepancies in EHR work across different specialties may inform strategies to reduce EHR burdens.

Materials and methods: We retrospectively evaluated EHR use by ambulatory physicians in 4 specialties (2 procedural [cardiology and gastroenterology] and 2 nonprocedural [internal medicine and rheumatology]) during 1 year at a large academic medical institution. Gender differences in EHR and clinical workload across specialties were evaluated by analysis of variance. Mixed-effects linear regression models analyzed gender differences in EHR workload controlling for specialty. Significant differences were additionally examined by stratifying procedural and nonprocedural specialties.

Results: Clinical and EHR workload varied across specialties (P < .05), though scheduled clinical workload did not differ by gender. Controlling for specialty, women physicians spent more time per appointment on In Basket messages (P = .001), sent more Secure Chat messages per appointment (P = .003), and spent more time in the EHR outside 7:00 AM-7:00 PM (P < .001) than men. Gender differences in messaging were concentrated among the procedural physicians. Women procedural physicians spent more time on In Basket messages (P < .001) and sent more Secure Chat messages (P = .007) than men, whereas these differences did not occur among nonprocedural physicians.

Discussion: Women physicians had greater EHR burdens than men despite similar scheduled clinical workloads. The greater messaging workload predominantly affected women procedural physicians.

Conclusion: Gender disparities in EHR burden in ambulatory specialties vary between procedural and nonprocedural fields. Future research is needed to mitigate gender inequity in EHR workloads.

JAMIA Open 8(6): ooaf164. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12715314/pdf/
Citations: 0
Exploring common data model coverage of nursing flowsheet data: a pilot study using SNOMED CT and LOINC mapping.
IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-12-14 eCollection Date: 2025-12-01 DOI: 10.1093/jamiaopen/ooaf168
Robin Austin, Malin Britt Lalich, Katy Stewart, Jonna Zarbano, Matthew Byrne, Melissa D Pinto, Elizabeth E Umberfield

Objectives: The primary objective of this research is to assess the content coverage of nursing data within a publicly available common data model (CDM), focusing on how nursing data, documented in flowsheets, are represented within the model.

Materials and methods: This mapping study was informed by previous evaluation studies and serves as a framework for evaluating information resources and for guiding their development and implementation. The overall research process consists of 4 steps: (1) identify a CDM; (2) define evaluation criteria; (3) map nursing flowsheet data; and (4) apply evaluation criteria.

Results: Overall, 65.5% (n = 1170) of the flowsheet concepts were mapped to Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) and Logical Observation Identifiers Names and Codes (LOINC) target codes and 56.0% (n = 1831) of the flowsheet values were mapped to SNOMED CT and LOINC target codes. The flowsheet concepts had a higher average mapping time per concept/reviewer (1.19 min) as compared to the average mapping time per value/reviewer (0.64 min).

Discussion: This mapping study demonstrated the progress and ongoing challenges of mapping nursing data to a national common data model. However, the ability to use nursing data at scale in a national CDM remains limited until more comprehensive mapping is completed.

Conclusion: This mapping study identifies a significant gap in integrating nursing data into a national common data model, highlighting an opportunity to enhance patient care through improved real-time insights and evidence-based nursing practices. Addressing this gap can help shape policies that prioritize the inclusion of nursing data. Additionally, aligning nursing data at scale can advance research, increase efficiency, and optimize nurse-sensitive patient outcomes.

Citations: 0
Correction to: Response to survey directed to patient portal members differs by age, race, and healthcare utilization.
IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-12-10 eCollection Date: 2025-12-01 DOI: 10.1093/jamiaopen/ooaf124

[This corrects the article DOI: 10.1093/jamiaopen/ooz061.].

Citations: 0
Automated classification of exposure and encourage events in speech data from pediatric OCD treatment.
IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-12-09 eCollection Date: 2025-12-01 DOI: 10.1093/jamiaopen/ooaf127
Juan Antonio Lossio-Ventura, Samuel Frank, Grace Ringlein, Kirsten Bonson, Ardyn Olszko, Abbey Knobel, Daniel S Pine, Jennifer B Freeman, Kristen Benito, David C Jangraw, Francisco Pereira

Objective: To develop and evaluate an automated classification system for labeling Exposure Process Coding System (EPCS) quality codes-specifically exposure and encourage events-during in-person exposure therapy sessions using automatic speech recognition (ASR) and natural language processing techniques.

Materials and methods: The system was trained and tested on 360 manually labeled pediatric Obsessive-Compulsive Disorder (OCD) therapy sessions from 3 clinical trials. Audio recordings were transcribed using ASR tools (OpenAI's Whisper and Google Speech-to-Text). Transcription accuracy was evaluated via word error rate (WER) on manual transcriptions of 2-minute audio segments compared against ASR-generated transcripts. The resulting text was analyzed with transformer-based models, including Bidirectional Encoder Representations from Transformers (BERT), Sentence-BERT, and Meta Llama 3. Models were trained to predict EPCS codes in 2 classification settings: sequence-level classification, where events are labeled in delimited text chunks, and token-level classification, where event boundaries are unknown. Classification was performed either with fine-tuned transformer-based models, or with logistic regression on embeddings produced by each model.
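The word error rate used to compare ASR output against manual transcripts can be illustrated with a minimal Python sketch. It assumes the usual definition (word-level edit distance divided by the number of reference words) and simple lowercase whitespace tokenization; the abstract does not describe the text normalization the authors applied, and the function name and example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance (substitutions, deletions, insertions)
    divided by the number of words in the reference transcript."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical manual transcript versus hypothetical ASR output.
reference = "the child touched the doorknob"
hypothesis = "the child touch the door knob"
print(f"WER = {word_error_rate(reference, hypothesis):.2f}")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.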

Results: With respect to transcription accuracy, Whisper outperformed Google Speech-to-Text with a lower WER (0.31 vs 0.51). In the sequence-level classification setting, Llama 3 models achieved high performance, with area under the ROC curve (AUC) scores of 0.95 for exposures and 0.75 for encourage events, outperforming traditional methods and standard BERT models. In the token-level setting, fine-tuned BERT models performed best, achieving AUC scores of 0.85 for exposures and 0.75 for encourage events.
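For readers less familiar with the AUC metric reported here, the following Python sketch computes it through its rank interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. This is a generic illustration and not the authors' evaluation code; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive
    and negative scores (equivalent to the Mann-Whitney U statistic)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = event present) and classifier scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(f"AUC = {roc_auc(labels, scores):.2f}")
```

The pairwise loop is quadratic in the number of examples, which is acceptable for illustration; a sort-based rank computation would be preferable on large evaluation sets.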

Discussion and conclusion: Current ASR and transformer-based models enable automated quality coding of in-person exposure therapy sessions. These findings demonstrate potential for real-time assessment in clinical practice and scalable research on effective therapy methods. Future work should focus on optimization, including improvements in ASR accuracy, expanding training datasets, and multimodal data integration.

Citations: 0
Utilizing natural language processing to identify cancer-relevant publications at a National Cancer Institute-designated cancer center.
IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-12-09 eCollection Date: 2025-12-01 DOI: 10.1093/jamiaopen/ooaf156
Whitney Shae, Md Saiful Islam Saif, John Fife, Dinesh Pal Mudaranthakam, Dong Pei, Lisa Harlan-Williams, Jeffrey A Thompson, Devin C Koestler

Objectives: The objective of this study was to develop and test natural language processing (NLP) methods for screening and, ultimately, predicting the cancer relevance of peer-reviewed publications.

Materials and methods: Two datasets were used: (1) manually curated publications labeled for cancer relevance, co-authored by members of The University of Kansas Cancer Center (KUCC) and (2) a derived dataset containing cancer-related abstracts from American Association for Cancer Research journals and noncancer-related abstracts from other medical journals. Two text encoding methods were explored: term frequency-inverse document frequency (TF-IDF) vectorization and various BERT embeddings. These representations served as inputs to 3 supervised machine learning classifiers: Support Vector Classification (SVC), Gradient Boosting Classification, and Multilayer Perceptron (MLP) neural networks. Model performance was evaluated by comparing predictions to the "true" cancer-relevant labels in a withheld test set.
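As a rough illustration of the TF-IDF text representation mentioned above, the Python sketch below builds L2-normalized TF-IDF vectors with the standard library only. It assumes one common smoothed weighting, idf = ln((1 + N) / (1 + df)) + 1, and plain whitespace tokenization; the abstract does not state which variant or preprocessing the authors used, and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Smoothed inverse document frequency (an assumed, common variant).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        raw = {t: count * idf[t] for t, count in term_freq.items()}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        vectors.append({t: w / norm for t, w in raw.items()} if norm else {})
    return vectors


# Hypothetical abstracts: terms unique to one document get more weight
# than terms shared across documents.
docs = [
    "tumor growth in mice",
    "tumor suppressor gene",
    "sleep quality in nurses",
]
for vec in tfidf(docs):
    print({term: round(weight, 3) for term, weight in vec.items()})
```

Vectors of this kind would then be passed to a supervised classifier such as those listed in the abstract; that step is omitted here to keep the sketch dependency-free.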

Results: All machine learning models performed best when trained and tested within the derived dataset. Across the datasets, SVC and MLP both exhibited strong performance, with F1 scores as high as 0.976 and 0.997, respectively. BioBERT embeddings resulted in slightly higher metrics when compared to TF-IDF vectorization across most models.

Discussion: Models trained on the derived data performed very well internally; however, weaker performance was noted when these models were tested on the KUCC dataset. This finding highlights the subjective nature of cancer-relevant determinations. In contrast, KUCC-trained models had high predictive performance when tested on the derived-specific classifications, showing that models trained on the KUCC dataset may be suitable for wider cancer-relevant prediction.

Conclusions: Overall, our results suggest that NLP can effectively automate the classification of cancer-relevant publications, enhancing research productivity tracking; however, great care should be taken in selecting the appropriate data, text representation approach, and machine learning approach.

Citations: 0
Multimodal feature analysis for automated neonatal jaundice assessment using machine learning.
IF 3.4 Q2 HEALTH CARE SCIENCES & SERVICES Pub Date : 2025-12-06 eCollection Date: 2025-12-01 DOI: 10.1093/jamiaopen/ooaf165
Yunfeng Liang, Lin Zou, Millie Ming Rong Goh, Alvin Jia Hao Ngeow, Ngiap Chuan Tan, Andy Wee An Ta, Han Leong Goh

Objective: Neonatal jaundice monitoring is resource-intensive. Existing artificial intelligence methods use image or clinical data, but none systematically combine both or compare feature contributions. This study fills that gap by extracting and analyzing multimodal features on a large dataset, identifying an optimal feature set for accurate, accessible jaundice assessment.

Materials and methods: This study collected clinical data and skin images from 3 body regions of 633 neonates, generating 460 features across 4 categories. Four tree-based models were used to predict total serum bilirubin levels and feature importance analysis guided the selection of an optimal feature set.

Results: The optimal performance was achieved using the Light Gradient Boosting Machine (LGBM) model with 140 selected features, yielding a root mean square error (RMSE) of 2.0477 mg/dL and a Pearson correlation of 0.8435. This represents a performance gain of over 10% in RMSE compared to models using only a single data modality. Moreover, selecting the top 30 features based on SHapley Additive exPlanation (SHAP) allows for a substantial reduction in data dimensionality, while maintaining performance within 5% of the optimal model.
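The two regression metrics reported above, RMSE and Pearson correlation, can be sketched in a few lines of Python. The bilirubin values below are hypothetical, and the relative-gain helper assumes the gain is expressed as the fractional reduction in RMSE against a baseline, which the abstract does not spell out.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def relative_rmse_gain(baseline_rmse, improved_rmse):
    """Fractional RMSE reduction relative to a baseline (assumed definition)."""
    return (baseline_rmse - improved_rmse) / baseline_rmse


# Hypothetical total serum bilirubin values in mg/dL.
measured = [5.1, 8.4, 11.9, 14.2, 9.7]
predicted = [6.0, 7.9, 10.5, 15.0, 9.1]
print(f"RMSE = {rmse(measured, predicted):.3f} mg/dL")
print(f"Pearson r = {pearson_r(measured, predicted):.3f}")
```

RMSE is in the same units as the target (mg/dL here), whereas Pearson correlation is unitless and insensitive to a constant bias, so the two metrics are complementary.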

Discussion: Color features contributed over 60% of the total importance, with clinical data adding more than 25%, led by hour of life. Light temperature also affected predictions, while texture features had minimal impact. Among body regions, the abdomen provided the most informative signals for jaundice severity.

Conclusion: The proposed algorithm shows promise for real-world use by enabling timely, automated jaundice assessment for families, while also offering insights for future research and broader medical applications.

Citations: 0