
Journal of the American Medical Informatics Association: Latest Articles

VIEWER: an extensible visual analytics framework for enhancing mental healthcare.
IF 4.6; Zone 2 (Medicine); Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf010
Tao Wang, David Codling, Yamiko Joseph Msosa, Matthew Broadbent, Daisy Kornblum, Catherine Polling, Thomas Searle, Claire Delaney-Pope, Barbara Arroyo, Stuart MacLellan, Zoe Keddie, Mary Docherty, Angus Roberts, Robert Stewart, Philip McGuire, Richard Dobson, Robert Harland

Objective: A proof-of-concept study aimed at designing and implementing Visual & Interactive Engagement With Electronic Records (VIEWER), a versatile toolkit for visual analytics of clinical data, and systematically evaluating its effectiveness across various clinical applications while gathering feedback for iterative improvements.

Materials and methods: VIEWER is an open-source and extensible toolkit that employs natural language processing and interactive visualization techniques to facilitate the rapid design, development, and deployment of clinical information retrieval, analysis, and visualization at the point of care. Through an iterative and collaborative participatory design approach, VIEWER was designed and implemented in one of the United Kingdom's largest National Health Services mental health Trusts, where its clinical utility and effectiveness were assessed using both quantitative and qualitative methods.

Results: VIEWER provides interactive, problem-focused, and comprehensive views of longitudinal patient data (n = 409 870) from a combination of structured clinical data and unstructured clinical notes. Despite a relatively short adoption period and users' initial unfamiliarity, VIEWER significantly improved performance and task completion speed compared to the standard clinical information system. More than 1000 users and partners in the hospital tested and used VIEWER, reporting high satisfaction and expressing strong interest in incorporating VIEWER into their daily practice.

Discussion: VIEWER provides a cost-effective enhancement to the functionalities of standard clinical information systems, with evaluation offering valuable feedback for future improvements.

Conclusion: VIEWER was developed to improve data accessibility and representation across various aspects of healthcare delivery, including population health management and patient monitoring. The deployment of VIEWER highlights the benefits of collaborative refinement in optimizing health informatics solutions for enhanced patient care.

Pages: 144-158; PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758470/pdf/
Citations: 0
Interdisciplinary development and application of computational methods in informatics for clinical applications.
IF 4.6; Zone 2 (Medicine); Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf209
David Albers, Kenrick Cato, Anita Layton, Sarah C Rossetti
Volume 33, issue 1; Pages: 1-6; PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758474/pdf/
Citations: 0
Interdisciplinary systems may restore the healthcare professional-patient relationship in electronic health systems.
IF 4.6; Zone 2 (Medicine); Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf001
Michael R Cauley, Richard J Boland, S Trent Rosenbloom

Objective: To develop a framework that models the impact of electronic health record (EHR) systems on healthcare professionals' well-being and their relationships with patients, using interdisciplinary insights to guide machine learning in identifying value patterns important to healthcare professionals in EHR systems.

Materials and methods: A theoretical framework of EHR systems' implementation was developed using interdisciplinary literature from healthcare, information systems, and management science focusing on the systems approach, clinical decision-making, and interface terminologies.

Observations: Healthcare professionals balance personal norms of narrative and data-driven communication in knowledge creation for EHRs by integrating detailed patient stories with structured data. This integration forms 2 learning loops that create tension in the healthcare professional-patient relationship, shaping how healthcare professionals apply their values in care delivery. The manifestation of this value tension in EHRs directly affects the well-being of healthcare professionals.

Discussion: Understanding the value tension learning loop between structured data and narrative forms lays the groundwork for future studies of how healthcare professionals use EHRs to deliver care, emphasizing their well-being and patient relationships through a sociotechnical lens.

Conclusion: EHR systems can improve the healthcare professional-patient relationship and healthcare professional well-being by integrating norms and values into pattern recognition of narrative and data communication forms.

Pages: 227-233; PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758461/pdf/
Citations: 0
DySurv: dynamic deep learning model for survival analysis with conditional variational inference.
IF 4.6; Zone 2 (Medicine); Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocae271
Munib Mesinovic, Peter Watkinson, Tingting Zhu

Objective: Machine learning applications for longitudinal electronic health records often forecast the risk of events at fixed time points, whereas survival analysis achieves dynamic risk prediction by estimating time-to-event distributions. Here, we propose a novel conditional variational autoencoder-based method, DySurv, which uses a combination of static and longitudinal measurements from electronic health records to estimate the individual risk of death dynamically.

Materials and methods: DySurv directly estimates the cumulative risk incidence function without making any parametric assumptions on the underlying stochastic process of the time-to-event. We evaluate DySurv on 6 time-to-event benchmark datasets in healthcare, as well as 2 real-world intensive care unit (ICU) electronic health records (EHR) datasets extracted from the eICU Collaborative Research (eICU) and the Medical Information Mart for Intensive Care database (MIMIC-IV).

Results: DySurv outperforms other existing statistical and deep learning approaches to time-to-event analysis across concordance and other metrics. It achieves time-dependent concordance of over 60% in the eICU case. It is also over 12% more accurate and 22% more sensitive than in-use ICU scores like Acute Physiology and Chronic Health Evaluation (APACHE) and Sequential Organ Failure Assessment (SOFA) scores. The predictive capacity of DySurv is consistent and the survival estimates remain disentangled across different datasets.
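
The concordance figures above can be made concrete with a small example. The sketch below computes Harrell's concordance index, a simpler relative of the time-dependent concordance reported in the abstract; it is not the authors' evaluation code, and the function name `concordance_index` and the toy data are assumptions made for illustration. A pair of subjects counts as comparable when the one with the shorter follow-up time had an observed event.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C: share of comparable pairs whose risk ordering matches
    the observed ordering of event times (ties in risk count as 0.5)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): follow-up times, event indicators (1 = death
# observed, 0 = censored), and predicted risk scores.
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
reversed_order = concordance_index(times, events, [0.1, 0.4, 0.7, 0.9])
constant = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to uninformative scores, and 0.0 to a fully inverted ranking.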

Discussion: Our interdisciplinary framework successfully incorporates deep learning, survival analysis, and intensive care to create a novel method for time-to-event prediction from longitudinal health records. We test our method on several held-out test sets from a variety of healthcare datasets and compare it to existing in-use clinical risk scoring benchmarks.

Conclusion: While our method leverages non-parametric extensions to deep learning-guided estimations of the survival distribution, further deep learning paradigms could be explored.

Pages: 112-122; PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758469/pdf/
Citations: 0
myAURA: a personalized health library for epilepsy management via knowledge graph sparsification and visualization.
IF 4.6; Zone 2 (Medicine); Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf012
Rion Brattig Correia, Jordan C Rozum, Leonard Cross, Jack Felag, Michael Gallant, Ziqi Guo, Bruce W Herr, Aehong Min, Jon Sanchez-Valle, Deborah Stungis Rocha, Alfonso Valencia, Xuan Wang, Katy Börner, Wendy Miller, Luis M Rocha

Objectives: Report the development of the patient-centered myAURA application and suite of methods designed to aid epilepsy patients, caregivers, and clinicians in making decisions about self-management and care.

Materials and methods: myAURA rests on an unprecedented collection of epilepsy-relevant heterogeneous data resources, such as biomedical databases, social media, and electronic health records (EHRs). We use a patient-centered biomedical dictionary to link the collected data in a multilayer knowledge graph (KG) computed with a generalizable, open-source methodology.

Results: Our approach is based on a novel network sparsification method that uses the metric backbone of weighted graphs to discover important edges for inference, recommendation, and visualization. We demonstrate by studying drug-drug interaction from EHRs, extracting epilepsy-focused digital cohorts from social media, and generating a multilayer KG visualization. We also present our patient-centered design and pilot-testing of myAURA, including its user interface.
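
To illustrate the idea of a metric backbone mentioned above, here is a minimal sketch, not the myAURA implementation. It assumes edge weights are already distances (smaller means more closely related) and keeps an edge only when no indirect path between its endpoints is shorter than the edge itself. The function name `metric_backbone` and the toy graph are hypothetical.

```python
from itertools import product
from typing import Dict, Hashable, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def metric_backbone(
    nodes: Sequence[Hashable],
    edges: Dict[Edge, float],
) -> Dict[Edge, float]:
    """Return the edges of an undirected distance graph whose direct
    distance equals the shortest-path distance between their endpoints."""
    inf = float("inf")
    dist = {(a, b): (0.0 if a == b else inf) for a, b in product(nodes, repeat=2)}
    for (u, v), d in edges.items():
        dist[u, v] = min(dist[u, v], d)
        dist[v, u] = min(dist[v, u], d)
    # Floyd-Warshall all-pairs shortest paths; fine for small graphs.
    for k in nodes:
        for a in nodes:
            for b in nodes:
                via = dist[a, k] + dist[k, b]
                if via < dist[a, b]:
                    dist[a, b] = via
    # Keep an edge only if no indirect route beats it (small float tolerance).
    return {e: d for e, d in edges.items() if d <= dist[e] + 1e-12}


# Toy graph (hypothetical): the direct A-C edge (3.0) is longer than the
# route A-B-C (2.0), so it is dropped from the backbone.
nodes = ["A", "B", "C"]
edges = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 3.0}
backbone = metric_backbone(nodes, edges)
```

The backbone preserves all shortest-path distances of the original graph while removing redundant edges, which is what makes it useful for sparsifying a large knowledge graph before visualization.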

Discussion: The ability to search and explore myAURA's heterogeneous data sources in a single, sparsified, multilayer KG is highly useful for a range of epilepsy studies and stakeholder support.

Conclusion: Our stakeholder-driven, scalable approach to integrating traditional and nontraditional data sources enables both clinical discovery and data-powered patient self-management in epilepsy and can be generalized to other chronic conditions.

Pages: 167-181; PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758476/pdf/
Citations: 0
A perspective on individualized treatment effects estimation from time-series health data.
IF 4.6; Zone 2 (Medicine); Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocae323
Ghadeer O Ghosheh, Moritz Gögl, Tingting Zhu

Objectives: The objective of this study is to provide an overview of the current landscape of individualized treatment effects (ITE) estimation, specifically focusing on methodologies proposed for time-series electronic health records (EHRs). We aim to identify gaps in the literature, discuss challenges, and propose future research directions to advance the field of personalized medicine.

Materials and methods: We conducted a comprehensive literature review to identify and analyze relevant works on ITE estimation for time-series data. The review focused on theoretical assumptions, types of treatment settings, and computational frameworks employed in the existing literature.

Results: The literature reveals a growing body of work on ITE estimation for tabular data, while methodologies specific to time-series EHRs are limited. We summarize and discuss the latest advancements, including the types of models proposed, the theoretical foundations, and the computational approaches used.

Discussion: The limitations and challenges of current ITE estimation methods for time-series data are discussed, including the lack of standardized evaluation metrics and the need for more diverse and representative datasets. We also highlight considerations and potential biases that may arise in personalized treatment effect estimation.

Conclusion: This work provides a comprehensive overview of ITE estimation for time-series EHR data, offering insights into the current state of the field and identifying future research directions. By addressing the limitations and challenges, we hope to encourage further exploration and innovation in this exciting and under-studied area of personalized medicine.

Pages: 234-241; PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758458/pdf/
Citations: 0
Improving clinical decision support through interpretable machine learning and error handling in electronic health records.
IF 4.6; Zone 2 (Medicine); Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf058
Mehak Arora, Hassan Mortagy, Nathan Dwarshuis, Jeffrey Wang, Philip Yang, Andre L Holder, Swati Gupta, Rishikesan Kamaleswaran

Objective: To develop an electronic medical record (EMR) data processing tool that confers clinical context to machine learning (ML) algorithms for error handling, bias mitigation, and interpretability.

Materials and methods: We present Trust-MAPS, an algorithm that translates clinical domain knowledge into high-dimensional, mixed-integer programming models that capture physiological and biological constraints on clinical measurements. EMR data are projected onto this constrained space, effectively bringing outliers within a physiologically feasible range. We then compute the distance of each data point from the constrained space modeling healthy physiology to quantify deviation from the norm. These distances, termed "trust-scores," are integrated into the feature space for downstream ML applications. We demonstrate the utility of Trust-MAPS by training a binary classifier for early sepsis prediction on data from the 2019 PhysioNet Computing in Cardiology Challenge, using the XGBoost algorithm and applying SMOTE to overcome class imbalance.
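
The projection-then-distance idea described above can be sketched in a heavily simplified form. This is not the Trust-MAPS implementation: the paper uses mixed-integer programming models with dependencies across measurements, whereas this sketch assumes independent per-variable ranges (a box), for which projection reduces to clipping. The variable names, the ranges, and the function names `project_onto_box` and `trust_score` are all hypothetical and carry no clinical meaning.

```python
import math
from typing import Dict, Tuple

Bounds = Dict[str, Tuple[float, float]]

# Hypothetical ranges for illustration only; not clinical guidance and not
# taken from the paper.
FEASIBLE: Bounds = {"heart_rate": (30.0, 200.0), "temp_c": (32.0, 42.0)}
HEALTHY: Bounds = {"heart_rate": (60.0, 100.0), "temp_c": (36.0, 38.0)}


def project_onto_box(record: Dict[str, float], bounds: Bounds) -> Dict[str, float]:
    """Clip each measurement into its [low, high] range, which is the
    Euclidean projection onto a box-shaped constraint set."""
    return {
        name: min(max(value, bounds[name][0]), bounds[name][1])
        for name, value in record.items()
    }


def trust_score(record: Dict[str, float], healthy: Bounds) -> float:
    """Euclidean distance from the record to the nearest point of the
    healthy box; 0.0 means every value is inside its healthy range."""
    nearest = project_onto_box(record, healthy)
    return math.sqrt(sum((record[name] - nearest[name]) ** 2 for name in record))


# A raw record with an implausible heart rate (likely a recording error).
raw = {"heart_rate": 250.0, "temp_c": 37.0}
cleaned = project_onto_box(raw, FEASIBLE)  # outlier pulled into feasible range
score = trust_score(cleaned, HEALTHY)      # deviation from the healthy range
features = {**cleaned, "trust_score": score}  # extra feature for a downstream model
```

In practice the measurements would need to be put on comparable scales before taking a Euclidean distance; the raw units are kept here only to keep the sketch short.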

Results: The Trust-MAPS framework shows desirable behavior in handling potential errors and boosting predictive performance. We achieve an area under the receiver operating characteristic curve of 0.91 (95% CI, 0.89-0.92) for predicting sepsis 6 hours before onset, a marked 15% improvement over a baseline model trained without Trust-MAPS.

Discussions: Downstream classification performance improves after Trust-MAPS preprocessing, highlighting the bias-reducing capabilities of the error-handling projections. Trust-scores emerge as clinically meaningful features that not only boost predictive performance for clinical decision support tasks but also lend interpretability to ML models.

Conclusion: This work is the first to translate clinical domain knowledge into mathematical constraints, model cross-vital dependencies, and identify aberrations in high-dimensional medical data. Our method allows for error handling in EMR and confers interpretability and superior predictive power to models trained for clinical decision support.

{"title":"Improving clinical decision support through interpretable machine learning and error handling in electronic health records.","authors":"Mehak Arora, Hassan Mortagy, Nathan Dwarshuis, Jeffrey Wang, Philip Yang, Andre L Holder, Swati Gupta, Rishikesan Kamaleswaran","doi":"10.1093/jamia/ocaf058","DOIUrl":"10.1093/jamia/ocaf058","url":null,"abstract":"<p><strong>Objective: </strong>To develop an electronic medical record (EMR) data processing tool that confers clinical context to machine learning (ML) algorithms for error handling, bias mitigation, and interpretability.</p><p><strong>Materials and methods: </strong>We present Trust-MAPS, an algorithm that translates clinical domain knowledge into high-dimensional, mixed-integer programming models that capture physiological and biological constraints on clinical measurements. EMR data are projected onto this constrained space, effectively bringing outliers to fall within a physiologically feasible range. We then compute the distance of each data point from the constrained space modeling healthy physiology to quantify deviation from the norm. These distances, termed \"trust-scores,\" are integrated into the feature space for downstream ML applications. We demonstrate the utility of Trust-MAPS by training a binary classifier for early sepsis prediction on data from the 2019 PhysioNet Computing in Cardiology Challenge, using the XGBoost algorithm and applying SMOTE for overcoming class-imbalance.</p><p><strong>Results: </strong>The Trust-MAPS framework shows desirable behavior in handling potential errors and boosting predictive performance. 
We achieve an area under the receiver operating characteristic curve of 0.91 (95% CI, 0.89-0.92) for predicting sepsis 6 hours before onset-a marked 15% improvement over a baseline model trained without Trust-MAPS.</p><p><strong>Discussions: </strong>Downstream classification performance improves after Trust-MAPS preprocessing, highlighting the bias reducing capabilities of the error-handling projections. Trust-scores emerge as clinically meaningful features that not only boost predictive performance for clinical decision support tasks but also lend interpretability to ML models.</p><p><strong>Conclusion: </strong>This work is the first to translate clinical domain knowledge into mathematical constraints, model cross-vital dependencies, and identify aberrations in high-dimensional medical data. Our method allows for error handling in EMR and confers interpretability and superior predictive power to models trained for clinical decision support.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"123-132"},"PeriodicalIF":4.6,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758464/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144003672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
In vitro to in vivo translation of artificial intelligence for clinical use: screening for acute coronary syndrome to identify ST-elevation myocardial infarction.
IF 4.6, Zone 2 (Medicine), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf101
Gabrielle Bunney, Kate Miller, Anna Graber-Naidich, Rana Kabeer, Sean M Bloos, Alexander J Wessels, Melissa A Pasao, Marium Rizvi, Ian P Brown, Maame Yaa A B Yiadom

Objective: The integration of predictive models into live clinical care requires scientific testing before implementation to ensure patient safety. We built and technically implemented a model that predicts which patients require an electrocardiogram (ECG) to screen for heart attacks within 10 minutes of their arrival to the Emergency Department. We developed a structured framework for the in vitro to in vivo translation of the model through implementation as clinical decision support (CDS).

Materials and methods: The CDS ran as a silent pilot for 2 months. We conducted (1) a Technical Component Analysis to ensure each part of the CDS coding functioned as planned, and (2) a Technical Fidelity Analysis to ensure agreement between the CDS's in vivo and the model's in vitro screening decisions.

Results: The Technical Component Analysis revealed several small coding errors in CDS components, which were addressed. During the pilot period, the CDS processed 18 335 patient encounters. CDS fidelity to the model reflected raw agreement of 95.5% (CI, 95.2%-95.9%) and kappa of 87.6% (CI, 86.7%-88.6%). Additional coding errors were identified and corrected.
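As a reference for the two fidelity statistics reported above, the following sketch computes raw agreement and Cohen's kappa for two sets of binary screening decisions. The decision lists are made-up toy data, not study data, and the function name is an assumption for illustration.

```python
def agreement_and_kappa(a, b):
    """Return (raw agreement, Cohen's kappa) for two equal-length lists of 0/1 decisions."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a = sum(a) / n  # share of positive decisions from the first source
    p_b = sum(b) / n  # share of positive decisions from the second source
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    if expected == 1.0:
        return observed, float("nan")  # kappa is undefined when both sources are constant
    return observed, (observed - expected) / (1 - expected)


# Toy example: 1 = "ECG recommended", 0 = "no ECG recommended" (hypothetical data).
cds_decisions = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
model_decisions = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
raw, kappa = agreement_and_kappa(cds_decisions, model_decisions)
```

Kappa is lower than raw agreement here because it discounts the agreement that two sources with these positive rates would reach by chance alone.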

Discussion: Our structured framework for the in vitro to in vivo translation of our predictive model uncovered ways to improve performance in vivo and the validity of risk assessment decisions. Testing predictive models on live care data and accompanying analyses is necessary to safely implement a predictive model for clinical use.

Conclusion: We developed a method for the translation of our model from in vitro to in vivo that can be utilized with other applications of predictive modeling in healthcare.

Citations: 0
Enhancing diagnostic precision for rare diseases using case-based reasoning.
IF 4.6, Zone 2 (Medicine), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf092
Richard Noll, Alexandra Berger, Carlo Facchinello, Katharina Stratmann, Jannik Schaaf, Holger Storf

Objective: This study aims to enhance the diagnostic process for rare diseases using case-based reasoning (CBR). CBR compares new cases with historical data, utilizing both structured and unstructured clinical data.

Materials and methods: The study uses a dataset of 4295 patient cases from the University Hospital Frankfurt. Data were standardized using the OMOP Common Data Model. Three methods (TF, TF-IDF, and TF-IDF with semantic vector embeddings) were employed to represent patient records. Similarity search effectiveness was evaluated using cross-validation to assess diagnostic precision. High-weighted concepts were rated by medical experts for relevance. Additionally, the impact of different levels of ICD-10 code granularity on prediction outcomes was analyzed.
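To make the TF-IDF representation and similarity search concrete, here is a minimal sketch using only the standard library. It assumes each patient record has already been reduced to a list of concept tokens; the tokens, the smoothed IDF formula, and the function names are illustrative choices rather than details taken from the study.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for tokenized records, with smoothed IDF."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [
        {t: (count / len(doc)) * idf[t] for t, count in Counter(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(query_idx, vectors, k=10):
    """Indices of the k records most similar to the query record, best first."""
    scores = [(cosine(vectors[query_idx], v), i)
              for i, v in enumerate(vectors) if i != query_idx]
    return [i for _, i in sorted(scores, reverse=True)[:k]]


# Hypothetical concept tokens standing in for standardized clinical concepts.
cases = [
    ["fever", "rash", "joint_pain"],
    ["fever", "rash", "fatigue"],
    ["fracture", "swelling"],
    ["fever", "joint_pain", "rash"],
]
vectors = tfidf_vectors(cases)
neighbors = most_similar(0, vectors, k=2)
```

In a case-based reasoning setup, the diagnoses attached to the retrieved neighbors would then serve as candidate diagnoses for the query case.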

Results: The TF-IDF method showed a high degree of precision, with an average positive predictive value of 91% in the 10 most similar cases. The differences between the methods were not statistically significant. The expert evaluation rated the medical relevance of high-weighted concepts as moderate. The granularity of ICD-10 coding significantly influences the precision of predictions, with more granular codes showing decreased precision.

Discussion: The methods effectively handle data from multiple medical specialties, suggesting broad applicability. The use of broader ICD-10 codes with high precision in prediction could improve initial diagnostic guidance. The use of Explainable AI could enhance diagnostic transparency, leading to better patient outcomes. Limitations include standardization issues and the need for more comprehensive lab value integration.

Conclusion: While CBR shows promise for rare disease diagnostics, its utility depends on the specific needs of the decision support system and its intended clinical application.

Citations: 0
Transport-based transfer learning on Electronic Health Records: application to detection of treatment disparities.
IF 4.6, Zone 2 (Medicine), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2026-01-01; DOI: 10.1093/jamia/ocaf134
Wanxin Li, Saad Ahmed, Yongjin P Park, Khanh Dao Duc

Objectives: Electronic Health Records (EHRs) sampled from different populations can introduce unwanted biases, limit individual-level data sharing, and make the data and fitted models difficult to transfer across population groups. In this context, our main goal is to design an effective method for transferring knowledge between population groups that offers computable guarantees of suitability and can be applied to quantify treatment disparities.

Materials and methods: For a model trained in an embedded feature space of one subgroup, our proposed framework, Optimal Transport-based Transfer Learning for EHRs (OTTEHR), combines feature embedding of the data and unbalanced optimal transport (OT) for domain adaptation to another population group. To test our method, we processed and divided the MIMIC-III and MIMIC-IV databases into multiple population groups using ICD codes and multiple labels.

Results: We derive a theoretical bound for the generalization error of our method and interpret it in terms of the Wasserstein distance, the unbalancedness between the source and target domains, and the labeling divergence. This bound can be used as a guide for assessing suitability in binary classification and regression tasks. In general, our method achieves better accuracy and computational efficiency than standard and machine-learning-based transfer learning methods on various tasks. When testing our method on populations with different insurance plans, we detect various levels of disparity in the duration of hospital stay between groups.
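For readers unfamiliar with the Wasserstein distance mentioned above, the sketch below computes it in the simplest possible case: two equal-size samples of a single variable, where optimal transport reduces to matching sorted values. This is a balanced, one-dimensional special case chosen for illustration, while the abstract describes unbalanced OT in an embedded feature space. The length-of-stay values are invented.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size empirical samples on the real line.

    With equal sample sizes and uniform weights, the optimal transport plan pairs
    the i-th smallest value of xs with the i-th smallest value of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Hypothetical lengths of stay (days) for two population groups.
stay_group_a = [2.0, 3.0, 5.0, 8.0]
stay_group_b = [12.0, 3.0, 7.0, 4.0]
shift = wasserstein_1d(stay_group_a, stay_group_b)
```

A larger distance indicates that the two groups' distributions differ more, which is the kind of quantity a transfer-suitability bound can depend on.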

Discussion and conclusion: By leveraging tools from OT theory, our proposed framework allows statistical models on EHR data to be compared between different population groups. As a potential application for clinical decision making, we quantify treatment disparities between different population groups. Future directions include applying OTTEHR to broader regression and classification tasks and extending the method to semi-supervised learning.

Citations: 0