整合无监督和监督学习技术预测创伤性脑损伤:一项基于人群的研究

Suvd Zulbayar , Tatyana Mollayeva , Angela Colantonio , Vincy Chan , Michael Escobar
{"title":"整合无监督和监督学习技术预测创伤性脑损伤:一项基于人群的研究","authors":"Suvd Zulbayar ,&nbsp;Tatyana Mollayeva ,&nbsp;Angela Colantonio ,&nbsp;Vincy Chan ,&nbsp;Michael Escobar","doi":"10.1016/j.ibmed.2023.100118","DOIUrl":null,"url":null,"abstract":"<div><p>This work aimed to identify pre-existing health conditions of patients with traumatic brain injury (TBI) and develop predictive models for the first TBI event and its external causes by employing a combination of unsupervised and supervised learning algorithms. We acquired up to five years of pre-injury diagnoses for 488,107 patients with TBI and 488,107 matched control patients who entered the emergency department or acute care hospitals between April 1st, 2002, and March 31st, 2020. Diagnoses were obtained from the Ontario Health Insurance Plan (OHIP) database which contains province-wide claims data by physicians in Ontario, Canada for inpatient and outpatient services. A screening process was conducted on the OHIP diagnostic codes to limit the subsequent analysis to codes that were predictive of TBI, which concluded that 314 codes were significantly associated with TBI. The Latent Dirichlet Allocation (LDA) model was applied to the diagnostic codes and generated an optimal number of 19 topics that concur with published literature but also suggest other unexplored areas. Estimated word-topic probabilities from the LDA model helped us detect pre-morbid conditions among patients with TBI by uncovering the underlying patterns of diagnoses, meanwhile estimated document-topic probabilities were utilized in variable creation as form of a dimension reduction. We created 19 topic scores for each patient in the cohort which were utilized along with socio-demographic factors for Random Forest binary classifier models. Test set performances evaluated using area under the receiver operating characteristic curve (AUC) were: TBI event (AUC = 0.85), external cause of injury: falls (AUC = 0.85), struck by/against (AUC = 0.83), cyclist collision (AUC = 0.76), motor vehicle collision (AUC = 0.83). Our analysis successfully demonstrated the feasibility of using machine learning to predict TBI due to various external causes and identified the most important factors that contribute to this prediction.</p></div>","PeriodicalId":73399,"journal":{"name":"Intelligence-based medicine","volume":"8 ","pages":"Article 100118"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666521223000327/pdfft?md5=7a277cf7b43235bd2edae27f0d38c38d&pid=1-s2.0-S2666521223000327-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Integrating unsupervised and supervised learning techniques to predict traumatic brain injury: A population-based study\",\"authors\":\"Suvd Zulbayar ,&nbsp;Tatyana Mollayeva ,&nbsp;Angela Colantonio ,&nbsp;Vincy Chan ,&nbsp;Michael Escobar\",\"doi\":\"10.1016/j.ibmed.2023.100118\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>This work aimed to identify pre-existing health conditions of patients with traumatic brain injury (TBI) and develop predictive models for the first TBI event and its external causes by employing a combination of unsupervised and supervised learning algorithms. We acquired up to five years of pre-injury diagnoses for 488,107 patients with TBI and 488,107 matched control patients who entered the emergency department or acute care hospitals between April 1st, 2002, and March 31st, 2020. Diagnoses were obtained from the Ontario Health Insurance Plan (OHIP) database which contains province-wide claims data by physicians in Ontario, Canada for inpatient and outpatient services. A screening process was conducted on the OHIP diagnostic codes to limit the subsequent analysis to codes that were predictive of TBI, which concluded that 314 codes were significantly associated with TBI. The Latent Dirichlet Allocation (LDA) model was applied to the diagnostic codes and generated an optimal number of 19 topics that concur with published literature but also suggest other unexplored areas. Estimated word-topic probabilities from the LDA model helped us detect pre-morbid conditions among patients with TBI by uncovering the underlying patterns of diagnoses, meanwhile estimated document-topic probabilities were utilized in variable creation as form of a dimension reduction. We created 19 topic scores for each patient in the cohort which were utilized along with socio-demographic factors for Random Forest binary classifier models. Test set performances evaluated using area under the receiver operating characteristic curve (AUC) were: TBI event (AUC = 0.85), external cause of injury: falls (AUC = 0.85), struck by/against (AUC = 0.83), cyclist collision (AUC = 0.76), motor vehicle collision (AUC = 0.83). Our analysis successfully demonstrated the feasibility of using machine learning to predict TBI due to various external causes and identified the most important factors that contribute to this prediction.</p></div>\",\"PeriodicalId\":73399,\"journal\":{\"name\":\"Intelligence-based medicine\",\"volume\":\"8 \",\"pages\":\"Article 100118\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2666521223000327/pdfft?md5=7a277cf7b43235bd2edae27f0d38c38d&pid=1-s2.0-S2666521223000327-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Intelligence-based medicine\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2666521223000327\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligence-based medicine","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666521223000327","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

本研究旨在识别创伤性脑损伤(TBI)患者的既往健康状况,并通过采用无监督和有监督学习算法的结合,开发首次TBI事件及其外部原因的预测模型。我们对2002年4月1日至2020年3月31日期间进入急诊科或急性护理医院的488,107名TBI患者和488,107名匹配的对照患者进行了长达5年的损伤前诊断。诊断结果来自安大略省健康保险计划(OHIP)数据库,该数据库包含加拿大安大略省医生对住院和门诊服务的全省索赔数据。对OHIP诊断代码进行筛选,将后续分析限制在预测TBI的代码上,结论是314个代码与TBI显着相关。将潜在狄利克雷分配(Latent Dirichlet Allocation, LDA)模型应用于诊断代码,并生成了19个主题的最佳数量,这些主题与已发表的文献一致,但也暗示了其他未探索的领域。来自LDA模型的估计词-主题概率通过揭示诊断的潜在模式帮助我们检测TBI患者的发病前状况,同时估计的文档-主题概率作为降维的形式用于变量创建。我们为队列中的每位患者创建了19个主题评分,并将其与社会人口统计学因素一起用于随机森林二元分类器模型。使用受试者工作特征曲线下面积(AUC)评价的测试集性能为:TBI事件(AUC = 0.85)、外因损伤:跌倒(AUC = 0.85)、被撞击(AUC = 0.83)、骑自行车者碰撞(AUC = 0.76)、机动车碰撞(AUC = 0.83)。我们的分析成功地证明了使用机器学习预测由各种外部原因引起的TBI的可行性,并确定了促成这一预测的最重要因素。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Integrating unsupervised and supervised learning techniques to predict traumatic brain injury: A population-based study

This work aimed to identify pre-existing health conditions of patients with traumatic brain injury (TBI) and develop predictive models for the first TBI event and its external causes by employing a combination of unsupervised and supervised learning algorithms. We acquired up to five years of pre-injury diagnoses for 488,107 patients with TBI and 488,107 matched control patients who entered the emergency department or acute care hospitals between April 1st, 2002, and March 31st, 2020. Diagnoses were obtained from the Ontario Health Insurance Plan (OHIP) database which contains province-wide claims data by physicians in Ontario, Canada for inpatient and outpatient services. A screening process was conducted on the OHIP diagnostic codes to limit the subsequent analysis to codes that were predictive of TBI, which concluded that 314 codes were significantly associated with TBI. The Latent Dirichlet Allocation (LDA) model was applied to the diagnostic codes and generated an optimal number of 19 topics that concur with published literature but also suggest other unexplored areas. Estimated word-topic probabilities from the LDA model helped us detect pre-morbid conditions among patients with TBI by uncovering the underlying patterns of diagnoses, meanwhile estimated document-topic probabilities were utilized in variable creation as form of a dimension reduction. We created 19 topic scores for each patient in the cohort which were utilized along with socio-demographic factors for Random Forest binary classifier models. Test set performances evaluated using area under the receiver operating characteristic curve (AUC) were: TBI event (AUC = 0.85), external cause of injury: falls (AUC = 0.85), struck by/against (AUC = 0.83), cyclist collision (AUC = 0.76), motor vehicle collision (AUC = 0.83). Our analysis successfully demonstrated the feasibility of using machine learning to predict TBI due to various external causes and identified the most important factors that contribute to this prediction.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Intelligence-based medicine
Intelligence-based medicine Health Informatics
CiteScore
5.00
自引率
0.00%
发文量
0
审稿时长
187 days
期刊最新文献
Artificial intelligence in child development monitoring: A systematic review on usage, outcomes and acceptance Automatic characterization of cerebral MRI images for the detection of autism spectrum disorders DOTnet 2.0: Deep learning network for diffuse optical tomography image reconstruction Artificial intelligence in child development monitoring: A systematic review on usage, outcomes and acceptance Clustering polycystic ovary syndrome laboratory results extracted from a large internet forum with machine learning
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1