Multimodal Learning for Cardiovascular Risk Prediction using EHR Data

A. Bagheri, T. K. Groenhof, W. B. Veldhuis, P. A. Jong, F. Asselbergs, D. Oberski
Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. Published 2020-08-27. DOI: 10.1145/3388440.3414924
Citations: 8

Abstract

Electronic health records (EHRs) contain structured and unstructured data of significant clinical and research value. Various machine learning approaches have been developed to use the information in EHRs for risk prediction. Most of these approaches, however, focus on structured EHR fields and lose the vast amount of information in the unstructured text. Deep neural networks, on the other hand, have gained tremendous momentum in knowledge discovery from EHR text, yet very few studies have used both the free text and the structured information in EHRs for clinical prediction. To exploit the information captured in EHRs, in this study we propose MI-BiLSTM, a multimodal framework based on bidirectional long short-term memory (BiLSTM) for cardiovascular risk prediction that integrates medical texts and structured clinical information. The MI-BiLSTM framework concatenates word embeddings from x-ray reports with classical clinical predictors from the Second Manifestations of ARTerial disease (SMART) study [1] before feeding them to a final fully connected neural network. In the experiments, we used the proposed framework to compare the performance of different deep neural network architectures on data from 5603 patients using 5-fold cross-validation. Evaluating on the SMART study data, we demonstrate the clinical relevance of integrating text features and classical predictors for cardiovascular risk prediction in patients with manifest vascular disease or at high risk of cardiovascular disease. Our results show that the MI-BiLSTM framework using text data in addition to laboratory values outperforms deep learning models using only known clinical predictors. In the future, we will focus on expanding our multimodal framework to import knowledge from available medical ontologies to enhance the quality of clinical decision making in risk prediction models. An open-source implementation of the proposed framework is publicly available at https://github.com/bagheria/CardioRisk-TextMining.
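The fusion step described in the abstract can be illustrated with a small forward-pass sketch. This is not the authors' implementation (see the repository linked above for that). It assumes the report text is first encoded by a bidirectional LSTM and that the resulting text vector is then concatenated with the structured predictors. All class names, dimensions, and the single sigmoid output unit standing in for the final fully connected network are hypothetical, and the weights are random and untrained.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTM:
    """Single-direction LSTM with random, untrained weights (illustration only)."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to the concatenation [x_t ; h_{t-1}].
        self.weights = {
            gate: rand_matrix(hidden_dim, input_dim + hidden_dim, rng)
            for gate in self.GATES
        }

    def last_hidden(self, sequence):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            z = x + h  # list concatenation of input and previous hidden state
            pre = {gate: matvec(self.weights[gate], z) for gate in self.GATES}
            i = [sigmoid(v) for v in pre["input"]]
            f = [sigmoid(v) for v in pre["forget"]]
            o = [sigmoid(v) for v in pre["output"]]
            cand = [math.tanh(v) for v in pre["candidate"]]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


class MultimodalRiskModel:
    """Hypothetical fusion model: BiLSTM text vector + structured predictors."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, n_structured, seed=0):
        rng = random.Random(seed)
        self.embeddings = rand_matrix(vocab_size, embed_dim, rng)
        self.forward_lstm = LSTM(embed_dim, hidden_dim, rng)
        self.backward_lstm = LSTM(embed_dim, hidden_dim, rng)
        fused_dim = 2 * hidden_dim + n_structured
        # Assumption: one sigmoid unit stands in for the fully connected network.
        self.dense_w = rand_matrix(1, fused_dim, rng)[0]
        self.dense_b = 0.0

    def fuse(self, token_ids, structured):
        seq = [self.embeddings[t] for t in token_ids]
        # Read the report left-to-right and right-to-left, keep both final states.
        text_vec = (self.forward_lstm.last_hidden(seq)
                    + self.backward_lstm.last_hidden(seq[::-1]))
        # Multimodal fusion: append the structured clinical predictors.
        return text_vec + list(structured)

    def predict_proba(self, token_ids, structured):
        fused = self.fuse(token_ids, structured)
        logit = sum(w * x for w, x in zip(self.dense_w, fused)) + self.dense_b
        return sigmoid(logit)


model = MultimodalRiskModel(vocab_size=50, embed_dim=4, hidden_dim=3, n_structured=2)
report_tokens = [1, 5, 7]        # hypothetical token ids of a tokenized report
clinical_values = [0.5, 1.2]     # hypothetical standardized predictors
risk = model.predict_proba(report_tokens, clinical_values)
```

Here `fuse` returns a vector of length `2 * hidden_dim + n_structured`, so the output layer sees both modalities at once. In a trained model the structured predictors would normally be standardized, and all weights would be learned end to end rather than sampled at random.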