Domain generalization for enhanced predictions of hospital readmission on unseen domains among patients with diabetes

Artificial Intelligence in Medicine | IF 6.1 | Zone 2 (Medicine) | JCR Q1 (Computer Science, Artificial Intelligence) | Pub Date: 2024-11-10 | DOI: 10.1016/j.artmed.2024.103010

Ameen Abdel Hai, Mark G. Weiner, Alice Livshits, Jeremiah R. Brown, Anuradha Paranjape, Wenke Hwang, Lester H. Kirchner, Nestoras Mathioudakis, Esra Karslioglu French, Zoran Obradovic, Daniel J. Rubin

Artificial Intelligence in Medicine, volume 158, Article 103010. Journal article. Source: https://www.sciencedirect.com/science/article/pii/S0933365724002525

Citation count: 0

Abstract

A model that predicts the risk of hospital readmission can help identify patients who may benefit from extra care. Developing hospital-specific readmission risk prediction models from local data is not feasible for many institutions, and models developed on data from one hospital may not generalize well to another. An end-to-end adaptable readmission model that generalizes to unseen test domains is lacking. We propose an early readmission risk domain generalization network, ERR-DGN, for cross-domain knowledge transfer. ERR-DGN internalizes the shared patterns and characteristics that are consistent across source domains, enabling it to adapt to a new domain. It transforms the source datasets into a common embedding space while capturing the relevant long-term temporal dependencies of sequential data. Domain generalization is then applied on domain-specific fully connected linear layers. The model is optimized with a loss function that combines the task-specific loss with a distribution discrepancy loss that matches the mean embeddings of the multiple source distributions.
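The combined objective described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary cross-entropy as the task-specific loss, the squared distance between per-domain mean embeddings (a linear-kernel, MMD-style term) as the discrepancy loss, and a hypothetical weight `lam` to balance the two. All function names and the toy embeddings are illustrative.

```python
import math
from itertools import combinations


def mean_embedding(batch):
    """Column-wise mean of a batch of embedding vectors (list of lists)."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]


def discrepancy_loss(domain_batches):
    """Average squared Euclidean distance between the mean embeddings
    of every pair of source domains (assumed form of the discrepancy term)."""
    means = [mean_embedding(b) for b in domain_batches]
    pairs = list(combinations(means, 2))
    if not pairs:
        return 0.0
    total = sum(sum((a - b) ** 2 for a, b in zip(m1, m2)) for m1, m2 in pairs)
    return total / len(pairs)


def bce_loss(probs, labels, eps=1e-12):
    """Mean binary cross-entropy, standing in for the task-specific loss."""
    terms = [
        -(y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps)))
        for p, y in zip(probs, labels)
    ]
    return sum(terms) / len(terms)


def total_loss(probs, labels, domain_batches, lam=1.0):
    """Task loss plus lam times the discrepancy loss (lam is hypothetical)."""
    return bce_loss(probs, labels) + lam * discrepancy_loss(domain_batches)


# Toy embeddings for three hypothetical source sites.
site_a = [[0.0, 1.0], [2.0, 3.0]]  # mean embedding [1.0, 2.0]
site_b = [[1.0, 2.0], [1.0, 2.0]]  # mean embedding [1.0, 2.0]
site_c = [[2.0, 2.0], [4.0, 4.0]]  # mean embedding [3.0, 3.0]

aligned = discrepancy_loss([site_a, site_b])
shifted = discrepancy_loss([site_a, site_b, site_c])
print(aligned, shifted)
```

In this sketch, sites whose mean embeddings coincide contribute no penalty, while a site with a shifted mean raises the loss, which is the pressure that pulls the source domains toward a shared representation.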

A model was developed using electronic health record (EHR) data from 201,688 patients with diabetes across urban, suburban, rural, and mixed hospital systems to improve 30-day readmission predictions for 67,066 unseen patients with diabetes at a rural hospital. We also explored how model performance varied with the number of sites and over time. The proposed method outperformed the baseline models, yielding a 6-percentage-point increase in F1-score (0.79 ± 0.006 vs. 0.73 ± 0.007). Model performance peaked with the inclusion of three sites. Performance was relatively stable for 3 years and then declined at 4 years. ERR-DGN may be a proficient tool for learning from data at multiple sites and subsequently applying a hospital readmission prediction model to a new site. Including a relatively small number of varied sites may be sufficient to achieve peak performance. Periodic retraining at least every 3 years may mitigate model degradation over time.
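The reported F1 gain is easiest to read as an absolute difference between the two scores. The short sketch below recalls the standard F1 definition from confusion-matrix counts and works out the absolute and relative gains for the two reported values (0.79 and 0.73). The helper names are illustrative and the abstract gives no confusion-matrix counts, so none are assumed here.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the score is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def gains(new, baseline):
    """Return (absolute, relative) improvement of new over baseline."""
    absolute = new - baseline
    return absolute, absolute / baseline


# F1-scores as reported in the abstract (means only, ignoring the ± terms).
abs_gain, rel_gain = gains(0.79, 0.73)
print(f"absolute gain: {abs_gain:.2f}")  # 0.06, i.e. 6 percentage points
print(f"relative gain: {rel_gain:.1%}")  # about 8.2% relative to the baseline
```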

Source journal

Artificial Intelligence in Medicine (Engineering and Technology: Biomedical Engineering)

CiteScore: 15.00
Self-citation rate: 2.70%
Articles published: 143
Review time: 6.3 months

Journal description: Artificial Intelligence in Medicine publishes original articles from a wide variety of interdisciplinary perspectives concerning the theory and practice of artificial intelligence (AI) in medicine, medically-oriented human biology, and health care. Artificial intelligence in medicine may be characterized as the scientific discipline pertaining to research studies, projects, and applications that aim at supporting decision-based medical tasks through knowledge- and/or data-intensive computer-based solutions that ultimately support and improve the performance of a human care provider.