Journal of the American Medical Informatics Association: Latest Articles

Improving clinical decision support through interpretable machine learning and error handling in electronic health records.
IF 4.6 | Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-01 | DOI: 10.1093/jamia/ocaf058
Mehak Arora, Hassan Mortagy, Nathan Dwarshuis, Jeffrey Wang, Philip Yang, Andre L Holder, Swati Gupta, Rishikesan Kamaleswaran

Objective: To develop an electronic medical record (EMR) data processing tool that confers clinical context to machine learning (ML) algorithms for error handling, bias mitigation, and interpretability.

Materials and methods: We present Trust-MAPS, an algorithm that translates clinical domain knowledge into high-dimensional, mixed-integer programming models that capture physiological and biological constraints on clinical measurements. EMR data are projected onto this constrained space, effectively bringing outliers into a physiologically feasible range. We then compute the distance of each data point from the constrained space modeling healthy physiology to quantify deviation from the norm. These distances, termed "trust-scores," are integrated into the feature space for downstream ML applications. We demonstrate the utility of Trust-MAPS by training a binary classifier for early sepsis prediction on data from the 2019 PhysioNet Computing in Cardiology Challenge, using the XGBoost algorithm and applying SMOTE to address class imbalance.
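
The projection and trust-score steps described above can be illustrated with a deliberately simplified sketch. It assumes independent per-vital ranges (box constraints), for which Euclidean projection reduces to clipping; the paper's mixed-integer models with cross-vital constraints are not reproduced here. The ranges, the names `FEASIBLE`, `HEALTHY`, and `add_trust_score`, and the unscaled distance are all hypothetical choices made for illustration.

```python
import math

# Hypothetical ranges for illustration only; they are NOT taken from the paper.
FEASIBLE = {"heart_rate": (20.0, 300.0), "temp_c": (25.0, 45.0)}
HEALTHY = {"heart_rate": (60.0, 100.0), "temp_c": (36.0, 38.0)}


def project_onto_box(record, box):
    """Euclidean projection onto a box: clip each coordinate to its range."""
    return {k: min(max(v, box[k][0]), box[k][1]) for k, v in record.items()}


def distance_to_box(record, box):
    """Euclidean distance from a record to its projection onto the box."""
    projected = project_onto_box(record, box)
    return math.sqrt(sum((record[k] - projected[k]) ** 2 for k in record))


def add_trust_score(record):
    """Clean a record, then append its distance from the 'healthy' box."""
    cleaned = project_onto_box(record, FEASIBLE)   # error handling step
    features = dict(cleaned)
    features["trust_score"] = distance_to_box(cleaned, HEALTHY)
    return features


raw = {"heart_rate": 400.0, "temp_c": 37.0}  # 400 bpm is likely an entry error
features = add_trust_score(raw)
print(features)
```

In this toy version the implausible heart rate is pulled back to the feasible bound, and the trust-score records how far the cleaned point still lies from the healthy range. A real pipeline would likely scale each vital before measuring distance so that no single unit dominates.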

Results: The Trust-MAPS framework shows desirable behavior in handling potential errors and boosting predictive performance. We achieve an area under the receiver operating characteristic curve of 0.91 (95% CI, 0.89-0.92) for predicting sepsis 6 hours before onset, a marked 15% improvement over a baseline model trained without Trust-MAPS.

Discussions: Downstream classification performance improves after Trust-MAPS preprocessing, highlighting the bias-reducing capabilities of the error-handling projections. Trust-scores emerge as clinically meaningful features that not only boost predictive performance for clinical decision support tasks but also lend interpretability to ML models.

Conclusion: This work is the first to translate clinical domain knowledge into mathematical constraints, model cross-vital dependencies, and identify aberrations in high-dimensional medical data. Our method allows for error handling in EMR and confers interpretability and superior predictive power to models trained for clinical decision support.

Pages: 123-132 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758464/pdf/
Citations: 0
A perspective on individualized treatment effects estimation from time-series health data.
IF 4.6 | Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-01 | DOI: 10.1093/jamia/ocae323
Ghadeer O Ghosheh, Moritz Gögl, Tingting Zhu

Objectives: The objective of this study is to provide an overview of the current landscape of individualized treatment effects (ITE) estimation, specifically focusing on methodologies proposed for time-series electronic health records (EHRs). We aim to identify gaps in the literature, discuss challenges, and propose future research directions to advance the field of personalized medicine.

Materials and methods: We conducted a comprehensive literature review to identify and analyze relevant works on ITE estimation for time-series data. The review focused on theoretical assumptions, types of treatment settings, and computational frameworks employed in the existing literature.

Results: The literature reveals a growing body of work on ITE estimation for tabular data, while methodologies specific to time-series EHRs are limited. We summarize and discuss the latest advancements, including the types of models proposed, the theoretical foundations, and the computational approaches used.

Discussion: The limitations and challenges of current ITE estimation methods for time-series data are discussed, including the lack of standardized evaluation metrics and the need for more diverse and representative datasets. We also highlight considerations and potential biases that may arise in personalized treatment effect estimation.

Conclusion: This work provides a comprehensive overview of ITE estimation for time-series EHR data, offering insights into the current state of the field and identifying future research directions. By addressing the limitations and challenges, we hope to encourage further exploration and innovation in this exciting and under-studied area of personalized medicine.

Pages: 234-241 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758458/pdf/
Citations: 0
DySurv: dynamic deep learning model for survival analysis with conditional variational inference.
IF 4.6 | Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-01 | DOI: 10.1093/jamia/ocae271
Munib Mesinovic, Peter Watkinson, Tingting Zhu

Objective: Machine learning applications for longitudinal electronic health records often forecast the risk of events at fixed time points, whereas survival analysis achieves dynamic risk prediction by estimating time-to-event distributions. Here, we propose a novel conditional variational autoencoder-based method, DySurv, which uses a combination of static and longitudinal measurements from electronic health records to estimate the individual risk of death dynamically.

Materials and methods: DySurv directly estimates the cumulative risk incidence function without making any parametric assumptions on the underlying stochastic process of the time-to-event. We evaluate DySurv on 6 time-to-event benchmark datasets in healthcare, as well as 2 real-world intensive care unit (ICU) electronic health records (EHR) datasets extracted from the eICU Collaborative Research (eICU) and the Medical Information Mart for Intensive Care database (MIMIC-IV).

Results: DySurv outperforms other existing statistical and deep learning approaches to time-to-event analysis across concordance and other metrics. It achieves time-dependent concordance of over 60% in the eICU case. It is also over 12% more accurate and 22% more sensitive than in-use ICU scores like Acute Physiology and Chronic Health Evaluation (APACHE) and Sequential Organ Failure Assessment (SOFA) scores. The predictive capacity of DySurv is consistent and the survival estimates remain disentangled across different datasets.
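
For readers unfamiliar with concordance, the following minimal sketch computes Harrell's concordance index, a simpler relative of the time-dependent concordance reported above. It is not the authors' evaluation code; the function name and the toy data are assumptions made for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i has an observed event and
    a strictly shorter time than subject j. Ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): time to event or censoring, event flag, predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for a reported concordance above 60%.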

Discussion: Our interdisciplinary framework successfully incorporates deep learning, survival analysis, and intensive care to create a novel method for time-to-event prediction from longitudinal health records. We test our method on several held-out test sets from a variety of healthcare datasets and compare it to existing in-use clinical risk scoring benchmarks.

Conclusion: While our method leverages non-parametric extensions to deep learning-guided estimations of the survival distribution, further deep learning paradigms could be explored.

Pages: 112-122 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758469/pdf/
Citations: 0
myAURA: a personalized health library for epilepsy management via knowledge graph sparsification and visualization.
IF 4.6 | Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-01 | DOI: 10.1093/jamia/ocaf012
Rion Brattig Correia, Jordan C Rozum, Leonard Cross, Jack Felag, Michael Gallant, Ziqi Guo, Bruce W Herr, Aehong Min, Jon Sanchez-Valle, Deborah Stungis Rocha, Alfonso Valencia, Xuan Wang, Katy Börner, Wendy Miller, Luis M Rocha

Objectives: Report the development of the patient-centered myAURA application and suite of methods designed to aid epilepsy patients, caregivers, and clinicians in making decisions about self-management and care.

Materials and methods: myAURA rests on an unprecedented collection of epilepsy-relevant heterogeneous data resources, such as biomedical databases, social media, and electronic health records (EHRs). We use a patient-centered biomedical dictionary to link the collected data in a multilayer knowledge graph (KG) computed with a generalizable, open-source methodology.

Results: Our approach is based on a novel network sparsification method that uses the metric backbone of weighted graphs to discover important edges for inference, recommendation, and visualization. We demonstrate it by studying drug-drug interactions from EHRs, extracting epilepsy-focused digital cohorts from social media, and generating a multilayer KG visualization. We also present our patient-centered design and pilot-testing of myAURA, including its user interface.
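
The idea of a metric backbone can be sketched as follows: an edge is kept only if its own distance equals the shortest-path distance between its endpoints, so no indirect route is shorter. This is a minimal illustration that assumes edge weights are already distances (smaller means closer) and uses Floyd-Warshall for all-pairs shortest paths; the function name and the toy graph are hypothetical, and the myAURA implementation may differ.

```python
import math


def metric_backbone(nodes, edges):
    """Return the edges whose distance equals the shortest-path distance.

    edges: dict mapping (u, v) to a non-negative distance, undirected graph.
    """
    dist = {u: {v: (0.0 if u == v else math.inf) for v in nodes} for u in nodes}
    for (u, v), d in edges.items():
        dist[u][v] = min(dist[u][v], d)
        dist[v][u] = min(dist[v][u], d)
    # Floyd-Warshall: relax every pair through every intermediate node k.
    for k in nodes:
        for i in nodes:
            for j in nodes:
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    # An edge survives when no indirect path is strictly shorter than it.
    return {e: d for e, d in edges.items() if d <= dist[e[0]][e[1]]}


nodes = ["A", "B", "C"]
edges = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 3.0}
backbone = metric_backbone(nodes, edges)
print(backbone)
```

Here the direct A-C edge (distance 3.0) is dropped because the path through B is shorter (2.0), while all shortest-path distances in the graph are preserved.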

Discussion: The ability to search and explore myAURA's heterogeneous data sources in a single, sparsified, multilayer KG is highly useful for a range of epilepsy studies and stakeholder support.

Conclusion: Our stakeholder-driven, scalable approach to integrating traditional and nontraditional data sources enables both clinical discovery and data-powered patient self-management in epilepsy and can be generalized to other chronic conditions.

Pages: 167-181 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758476/pdf/
Citations: 0
In vitro to in vivo translation of artificial intelligence for clinical use: screening for acute coronary syndrome to identify ST-elevation myocardial infarction.
IF 4.6 | Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-01 | DOI: 10.1093/jamia/ocaf101
Gabrielle Bunney, Kate Miller, Anna Graber-Naidich, Rana Kabeer, Sean M Bloos, Alexander J Wessels, Melissa A Pasao, Marium Rizvi, Ian P Brown, Maame Yaa A B Yiadom

Objective: The integration of predictive models into live clinical care requires scientific testing before implementation to ensure patient safety. We built and technically implemented a model that predicts which patients require an electrocardiogram (ECG) to screen for heart attacks within 10 minutes of their arrival to the Emergency Department. We developed a structured framework for the in vitro to in vivo translation of the model through implementation as clinical decision support (CDS).

Materials and methods: The CDS ran as a silent pilot for 2 months. We conducted (1) a Technical Component Analysis to ensure each part of the CDS coding functioned as planned, and (2) a Technical Fidelity Analysis to ensure agreement between the CDS's in vivo and the model's in vitro screening decisions.

Results: The Technical Component Analysis indicated several small coding errors in CDS components that were addressed. During this period, the CDS processed 18 335 patient encounters. CDS fidelity to the model reflected raw agreement of 95.5% (CI, 95.2%-95.9%) and kappa of 87.6% (CI, 86.7%-88.6%). Additional coding errors were identified and were corrected.
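
Raw agreement and Cohen's kappa, the two fidelity measures reported above, can be computed from paired screening decisions as in the sketch below. The decision lists are invented for illustration and the function name is an assumption; the study's own analysis code is not shown in the abstract.

```python
def agreement_and_kappa(a, b):
    """Raw agreement and Cohen's kappa for two equal-length label sequences."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    observed = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    # Chance agreement: product of each rater's marginal rate, summed over labels.
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical decisions: 1 = ECG recommended, 0 = not recommended.
model_in_vitro = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
cds_in_vivo = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
observed, kappa = agreement_and_kappa(model_in_vitro, cds_in_vivo)
print(observed, kappa)
```

Kappa is lower than raw agreement because it discounts the agreement expected by chance, which is why the abstract reports both figures.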

Discussion: Our structured framework for the in vitro to in vivo translation of our predictive model uncovered ways to improve performance in vivo and the validity of risk assessment decisions. Testing predictive models on live care data and accompanying analyses is necessary to safely implement a predictive model for clinical use.

Conclusion: We developed a method for the translation of our model from in vitro to in vivo that can be utilized with other applications of predictive modeling in healthcare.

Pages: 7-14 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758466/pdf/
Citations: 0
Transport-based transfer learning on Electronic Health Records: application to detection of treatment disparities.
IF 4.6 | Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2026-01-01 | DOI: 10.1093/jamia/ocaf134
Wanxin Li, Saad Ahmed, Yongjin P Park, Khanh Dao Duc

Objectives: Electronic Health Records (EHRs) sampled from different populations can introduce unwanted biases, limit individual-level data sharing, and make the data and fitted model hardly transferable across different population groups. In this context, our main goal is to design an effective method to transfer knowledge between population groups, with computable guarantees for suitability, and that can be applied to quantify treatment disparities.

Materials and methods: For a model trained in an embedded feature space of one subgroup, our proposed framework, Optimal Transport-based Transfer Learning for EHRs (OTTEHR), combines feature embedding of the data and unbalanced optimal transport (OT) for domain adaptation to another population group. To test our method, we processed and divided the MIMIC-III and MIMIC-IV databases into multiple population groups using ICD codes and multiple labels.

Results: We derive a theoretical bound for the generalization error of our method, and interpret it in terms of the Wasserstein distance, unbalancedness between the source and target domains, and labeling divergence, which can be used as a guide for assessing the suitability of binary classification and regression tasks. In general, our method achieves better accuracy and computational efficiency compared with standard and machine learning transfer learning methods on various tasks. Upon testing our method for populations with different insurance plans, we detect various levels of disparities in length of hospital stay between groups.
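
As a small illustration of the Wasserstein distance mentioned above, the sketch below handles the simplest special case: two one-dimensional empirical samples of equal size, where the optimal transport plan matches sorted values. It is a balanced, one-dimensional toy, not the unbalanced optimal transport in an embedded feature space that OTTEHR uses; the function name and the length-of-stay numbers are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    In one dimension the optimal coupling pairs the k-th smallest value of
    one sample with the k-th smallest value of the other.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(x - y) for x, y in pairs) / len(xs)


# Hypothetical lengths of stay (days) for two insurance groups.
group_a = [2, 3, 5, 8]
group_b = [12, 3, 7, 4]
gap = wasserstein_1d(group_a, group_b)
print(gap)
```

The result can be read as the average number of days each stay in one group would have to shift to match the other group's distribution.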

Discussion and conclusion: By leveraging tools from OT theory, our proposed framework allows comparison of statistical models on EHR data between different population groups. As a potential application for clinical decision making, we quantify treatment disparities between different population groups. Future directions include applying OTTEHR to broader regression and classification tasks and extending the method to semi-supervised learning.

{"title":"Transport-based transfer learning on Electronic Health Records: application to detection of treatment disparities.","authors":"Wanxin Li, Saad Ahmed, Yongjin P Park, Khanh Dao Duc","doi":"10.1093/jamia/ocaf134","DOIUrl":"10.1093/jamia/ocaf134","url":null,"abstract":"<p><strong>Objectives: </strong>Electronic Health Records (EHRs) sampled from different populations can introduce unwanted biases, limit individual-level data sharing, and make the data and fitted model hardly transferable across different population groups. In this context, our main goal is to design an effective method to transfer knowledge between population groups, with computable guarantees for suitability, and that can be applied to quantify treatment disparities.</p><p><strong>Materials and methods: </strong>For a model trained in an embedded feature space of one subgroup, our proposed framework, Optimal Transport-based Transfer Learning for EHRs (OTTEHR), combines feature embedding of the data and unbalanced optimal transport (OT) for domain adaptation to another population group. To test our method, we processed and divided the MIMIC-III and MIMIC-IV databases into multiple population groups using ICD codes and multiple labels.</p><p><strong>Results: </strong>We derive a theoretical bound for the generalization error of our method, and interpret it in terms of the Wasserstein distance, unbalancedness between the source and target domains, and labeling divergence, which can be used as a guide for assessing the suitability of binary classification and regression tasks. In general, our method achieves better accuracy and computational efficiency compared with standard and machine learning transfer learning methods on various tasks. 
Upon testing our method for populations with different insurance plans, we detect various levels of disparities in hospital duration stay between groups.</p><p><strong>Discussion and conclusion: </strong>By leveraging tools from OT theory, our proposed framework allows to compare statistical models on EHR data between different population groups. As a potential application for clinical decision making, we quantify treatment disparities between different population groups. Future directions include applying OTTEHR to broader regression and classification tasks and extending the method to semi-supervised learning.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"15-25"},"PeriodicalIF":4.6,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758479/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144994218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhancing diagnostic precision for rare diseases using case-based reasoning.
IF 4.6 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-01 DOI: 10.1093/jamia/ocaf092
Richard Noll, Alexandra Berger, Carlo Facchinello, Katharina Stratmann, Jannik Schaaf, Holger Storf

Objective: This study aims to enhance the diagnostic process for rare diseases using case-based reasoning (CBR). CBR compares new cases with historical data, utilizing both structured and unstructured clinical data.

Materials and methods: The study uses a dataset of 4295 patient cases from the University Hospital Frankfurt. Data were standardized using the OMOP Common Data Model. Three methods (TF, TF-IDF, and TF-IDF with semantic vector embeddings) were employed to represent patient records. Similarity search effectiveness was evaluated using cross-validation to assess diagnostic precision. High-weighted concepts were rated by medical experts for relevance. Additionally, the impact of different levels of ICD-10 code granularity on prediction outcomes was analyzed.

Results: The TF-IDF method showed a high degree of precision, with an average positive predictive value of 91% in the 10 most similar cases. The differences between the methods were not statistically significant. The expert evaluation rated the medical relevance of high-weighted concepts as moderate. The granularity of ICD-10 coding significantly influences the precision of predictions, with more granular codes showing decreased precision.

Discussion: The methods effectively handle data from multiple medical specialties, suggesting broad applicability. The use of broader ICD-10 codes with high precision in prediction could improve initial diagnostic guidance. The use of Explainable AI could enhance diagnostic transparency, leading to better patient outcomes. Limitations include standardization issues and the need for more comprehensive lab value integration.

Conclusion: While CBR shows promise for rare disease diagnostics, its utility depends on the specific needs of the decision support system and its intended clinical application.
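
To make the similarity search described in this abstract more concrete, here is a minimal sketch of TF-IDF weighting followed by cosine-similarity retrieval of the most similar cases. It is an illustration only, not the authors' implementation: the function names, the `+ 1.0` IDF smoothing, and the toy concept lists are all assumptions, and the semantic-embedding variant is not shown.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of concept lists, one per patient record.
    Returns one sparse {concept: weight} dict per record."""
    n = len(docs)
    # Document frequency: number of records in which each concept appears.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (the +1.0 is an assumption; other variants exist).
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(query_index, vectors, k=10):
    """Indices of the k records most similar to the query record."""
    scores = [(cosine(vectors[query_index], v), i)
              for i, v in enumerate(vectors) if i != query_index]
    return [i for _, i in sorted(scores, reverse=True)[:k]]

# Hypothetical toy records, not data from the study.
cases = [["fever", "rash", "joint_pain"], ["fever", "rash"], ["cough", "fatigue"]]
vecs = tfidf_vectors(cases)
nearest = most_similar(0, vecs, k=1)
```

In a case-based reasoning setting, the diagnoses attached to the retrieved cases would then be inspected to suggest candidates for the new case; the abstract's reported positive predictive value refers to the 10 most similar cases.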

SMART: a new patient similarity estimation framework for enhanced predictive modeling in acute kidney injury.
IF 4.6 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-01 DOI: 10.1093/jamia/ocaf125
Deyi Li, Alan S L Yu, Dana Y Fuhrman, Mei Liu

Objective: Accurately measuring patient similarity is essential for precision medicine, enabling personalized predictive modeling, disease subtyping, and individualized treatment by identifying patients with similar characteristics to an index patient. This study aims to develop an electronic health record-based patient similarity estimation framework to enhance personalized predictive modeling for Acute Kidney Injury (AKI), a complex and life-threatening condition where accurate prediction is critical for timely intervention.

Materials and methods: We introduce Similarity Measurement for Acute Kidney Injury Risk Tracking (SMART), a new patient similarity estimation framework with 3 key enhancements: (1) overlap weighting to adjust similarity scores; (2) distance measure optimization; and (3) feature type weight optimization. These enhancements were evaluated using internal and external validation datasets from 2 tertiary academic hospitals to predict AKI risk across varying group sizes of similar patients.

Results: The study analyzed data from 8637 patients in the reference patient pool and 8542 patients in each of the internal and external test sets. Each enhancement was independently evaluated while controlling for other variables to determine its impact on prediction performance. SMART consistently outperformed 3 baseline models on both the internal and external test sets (P<.05) and demonstrated improved performance in certain subpopulations with unique health profiles compared to a traditional machine learning approach.

Discussion: SMART improves the identification of high-quality similar patient groups, enhancing the accuracy of personalized AKI prediction across various group sizes. By accurately identifying clinically relevant similar patients, clinicians can tailor treatments more effectively, advancing personalized care.

Using large language models to detect outcomes in qualitative studies of adolescent depression.
IF 4.6 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-01 DOI: 10.1093/jamia/ocae298
Alison W Xin, Dylan M Nielson, Karolin Rose Krause, Guilherme Fiorini, Nick Midgley, Francisco Pereira, Juan Antonio Lossio-Ventura

Objective: We aim to use large language models (LLMs) to detect mentions of more nuanced psychotherapeutic outcomes and impacts than previously considered in transcripts of interviews with adolescents with depression. Our clinical authors previously created a novel coding framework containing fine-grained therapy outcomes beyond the binary classification (eg, depression vs control) based on qualitative analysis embedded within a clinical study of depression. Moreover, we seek to demonstrate that embeddings from LLMs are informative enough to accurately label these experiences.

Materials and methods: Data were drawn from interviews, where text segments were annotated with different outcome labels. Five different open-source LLMs were evaluated to classify outcomes from the coding framework. Classification experiments were carried out on the original interview transcripts. Furthermore, we repeated those experiments for versions of the data produced by breaking those segments into conversation turns, or by keeping only non-interviewer utterances (monologues).

Results: We used classification models to predict 31 outcomes and 8 derived labels, for 3 different text segmentations. Area under the ROC curve scores ranged between 0.6 and 0.9 for the original segmentation and 0.7 and 1.0 for the monologues and turns.

Discussion: LLM-based classification models could identify outcomes important to adolescents, such as friendships or academic and vocational functioning, in text transcripts of patient interviews. By using clinical data, we also aim to better generalize to clinical settings compared to studies based on public social media data.

Conclusion: Our results demonstrate that fine-grained therapy outcome coding in psychotherapeutic text is feasible, and can be used to support the quantification of important outcomes for downstream uses.

Predicting intracranial pressure monitor placement in children with traumatic brain injury: a prospective cohort study to develop a clinical decision support tool.
IF 4.6 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-01-01 DOI: 10.1093/jamia/ocaf120
Seth Russell, Peter E DeWitt, Laura Helmkamp, Kathryn Colborn, Charlotte Gray, Margaret Rebull, Yamila L Sierra, Rachel Greer, Lexi Petruccelli, Sara Shankman, Todd C Hankinson, Fuyong Xing, David J Albers, Tellen D Bennett

Objective: Clinicians currently make decisions about placing an intracranial pressure (ICP) monitor in children with traumatic brain injury (TBI) without the benefit of an accurate clinical decision support tool. The goal of this study was to develop and validate a model that predicts placement of an ICP monitor and updates as new information becomes available.

Materials and methods: A prospective observational cohort study was conducted from September 2014 to January 2024. The setting included one US hospital designated as an American College of Surgeons Level 1 Pediatric Trauma Center. Participants were 389 children with acute TBI admitted to the ICU who had at least one Glasgow Coma Scale (GCS) score ≤ 8 or intubation with at least one GCS-Motor ≤ 5. We excluded children who received ICP monitors prior to arrival, those with GCS = 3 and bilateral fixed, dilated pupils, and those with a do not resuscitate order.

Results: Of the 389 participants, 138 received ICP monitoring. Several machine learning models, including a recurrent neural network (RNN), were developed and validated using 4 combinations of input data. The best performing model, an RNN, achieved an F1 of 0.71 within 720 minutes of hospital arrival. The cumulative F1 of the RNN from minute 0 to 720 was 0.61. The best performing non-neural network model, standard logistic regression, achieved an F1 of 0.36 within 720 minutes of hospital arrival.

Conclusions: These findings will contribute to design and implementation of a multidisciplinary clinical decision support tool for ICP monitor placement in children with TBI.
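
The abstract above reports model quality as F1, a natural choice when the positive class is a minority (138 of 389 participants received ICP monitoring). As a reminder of how that metric is computed, here is a minimal sketch; the function name and the confusion-matrix counts are hypothetical illustrations, not values from the study.

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall.

    tp, fp, fn: counts of true positives, false positives, false negatives.
    Returns 0.0 when precision and recall are both undefined or zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts chosen only to exercise the formula.
example = f1_score(tp=8, fp=2, fn=2)  # precision 0.8, recall 0.8
```

Because F1 ignores true negatives, it penalizes a model that rarely predicts monitor placement even if its overall accuracy looks high.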
