Deep learning based prediction of depression and anxiety in patients with type 2 diabetes mellitus using regional electronic health records
Wei Feng, Honghan Wu, Hui Ma, Yuechuchu Yin, Zhenhuan Tao, Shan Lu, Xin Zhang, Yun Yu, Cheng Wan, Yun Liu
International Journal of Medical Informatics, Volume 196, Article 105801. Published 2025-01-22.
DOI: 10.1016/j.ijmedinf.2025.105801
https://www.sciencedirect.com/science/article/pii/S1386505625000188
Citations: 0
Abstract
Background
Depression and anxiety are prevalent mental health conditions among individuals with type 2 diabetes mellitus (T2DM), who exhibit unique vulnerabilities and etiologies. However, existing approaches fail to fully utilize regional heterogeneous electronic health record (EHR) data. Integrating this data can provide a more comprehensive understanding of depression and anxiety in T2DM patients, leading to more personalized treatment strategies.
Objective
This study aims to develop and validate a deep learning model, the Regional EHR for Depression and Anxiety Prediction Model (REDAPM), using regional EHR data to predict depression and anxiety in patients with T2DM.
Methods
A case-control development and validation study was conducted using regional EHR data from the Nanjing Health Information Center (NHIC). Two retrospective, matched (1:3) datasets were constructed from the full cohort for the model's internal and external validation. These two datasets were selected from the NHIC data of 2020 and 2022, respectively. The REDAPM incorporates both structured and unstructured EHR data, capturing the temporal dependency of clinical events. The performance of REDAPM was compared to a set of baseline models, evaluated using the area under the receiver operating characteristic curve (ROC-AUC) and the area under the precision-recall curve (PR-AUC). Subgroup, ablation, and interpretation analyses were conducted to identify relevant clinical features available from EHRs.
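The two evaluation metrics named above can be illustrated with a minimal sketch. The abstract does not say how ROC-AUC or PR-AUC were computed, so the pure-Python functions below are an assumption: ROC-AUC is taken as the pairwise ranking probability, and PR-AUC is approximated by average precision. The toy labels and scores are invented for illustration (two cases and six controls, mirroring a 1:3 ratio) and are not study data.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random case outranks a random control.

    A tie between a case and a control counts as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    Assumes no tied scores; ties would need to be grouped by threshold.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one case")
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each case
    return total / n_pos


# Hypothetical toy data: 1 = case, 0 = matched control (1:3 ratio).
labels = [1, 0, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.1, 0.4, 0.35, 0.6, 0.8, 0.2, 0.3]

print(f"ROC-AUC: {roc_auc(labels, scores):.3f}")            # 0.917
print(f"PR-AUC (AP): {average_precision(labels, scores):.3f}")  # 0.833
```

If the matching is exactly 1:3, about 25% of patients in each dataset are cases. A model that ranks patients at random would then be expected to score near 0.25 on PR-AUC and near 0.5 on ROC-AUC, which gives a reference point for reading the reported values.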
Results
The internal and external validation datasets comprised 24,724 and 34,340 patients, respectively. The REDAPM outperformed baseline models in both datasets, achieving ROC-AUC scores of 0.9029±0.008 and 0.7360±0.005, and PR-AUC scores of 0.8124±0.011 and 0.5504±0.009. Ablation and subgroup experiments confirmed the significant contribution of patients' medical history text to the model's performance. Integrated gradient score analysis identified the predictive importance of other mental disorders.
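To make the integrated gradient analysis concrete, here is a minimal sketch of the technique on a toy logistic model. It is not REDAPM: the model, the feature names, the weights, and the all-zero baseline are hypothetical and chosen only to show how attributions are computed. Integrated gradients averages the model's gradient along a straight path from a baseline input to the actual input and scales it by the input difference.

```python
import math
from typing import List, Sequence


def predict(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Toy logistic model: sigmoid(w . x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x: Sequence[float], w: Sequence[float], b: float) -> List[float]:
    """Analytic gradient of the logistic output with respect to each input."""
    p = predict(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(
    x: Sequence[float],
    baseline: Sequence[float],
    w: Sequence[float],
    b: float,
    steps: int = 200,
) -> List[float]:
    """Midpoint Riemann-sum approximation of integrated gradients."""
    diff = [xi - bi for xi, bi in zip(x, baseline)]
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th path segment
        point = [bi + alpha * di for bi, di in zip(baseline, diff)]
        for i, g in enumerate(gradient(point, w, b)):
            avg_grad[i] += g / steps
    return [di * gi for di, gi in zip(diff, avg_grad)]


# Hypothetical features and weights, for illustration only.
feature_names = ["other_mental_disorder", "insulin_use", "age_scaled"]
weights = [1.8, 0.4, -0.2]
bias = -1.5
patient = [1.0, 1.0, 0.5]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(patient, baseline, weights, bias)
for name, value in zip(feature_names, attributions):
    print(f"{name}: {value:+.4f}")
```

A useful sanity check is the completeness property: the attributions should sum, up to numerical error, to the difference between the model output for the patient and the output for the baseline.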
Conclusion
The REDAPM effectively leverages the heterogeneous characteristics of regional EHR data, demonstrating strong predictive performance for depression onset in diabetic patients. It also shows potential for discovering significant clinical features, indicating considerable promise for clinical utility.
About the journal:
International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings.
The scope of the journal covers:
Information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.;
Computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.;
Educational computer-based programs pertaining to medical informatics or medicine in general;
Organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.