Leveraging multivariate analysis and adjusted mutual information to improve stroke prediction and interpretability.

IF 1.2 | Zone 4 (Medicine) | Q4, Clinical Neurology | Neurosciences | Pub Date: 2024-07-01 | DOI: 10.17712/nsj.2024.3.20230100
Moutasem S Aboonq, Saeed A Alqahtani
Neurosciences, vol. 29, no. 3, pp. 190-196. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11305345/pdf/
Citation count: 0

Abstract

Objectives: To develop a machine learning model that accurately predicts stroke risk from demographic and clinical data, to identify the most significant stroke risk factors, and to determine the best-performing machine learning algorithm for stroke prediction.

Methods: This cross-sectional study analyzed data on 438,693 adults from the 2021 Behavioral Risk Factor Surveillance System. Features encompassed demographic and clinical factors. Descriptive analysis profiled the dataset. Logistic regression quantified risk relationships. Adjusted mutual information evaluated feature importance. Multiple machine learning models were built and evaluated on metrics including accuracy, AUC ROC, and F1 score.
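
As an illustration of the feature-importance step, the sketch below ranks discrete features by their adjusted mutual information (AMI) with a binary outcome. It is not the authors' code: it assumes scikit-learn's `adjusted_mutual_info_score`, and the helper `rank_features_by_ami`, the column names, and the synthetic data are all hypothetical stand-ins for the survey variables.

```python
import numpy as np
from sklearn.metrics import adjusted_mutual_info_score


def rank_features_by_ami(X, y, names):
    """Return (name, AMI) pairs sorted from most to least informative.

    Hypothetical helper: each column of X is treated as a discrete labeling
    and compared against the outcome labels y.
    """
    scores = [
        (name, adjusted_mutual_info_score(y, X[:, j]))
        for j, name in enumerate(names)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Synthetic stand-in data (an assumption, not the BRFSS dataset).
rng = np.random.default_rng(0)
n = 5000
hypertension = rng.integers(0, 2, n)  # informative binary flag
unrelated = rng.integers(0, 2, n)     # pure-noise binary flag

# The outcome depends on the informative flag only.
p_stroke = np.where(hypertension == 1, 0.6, 0.1)
stroke = (rng.random(n) < p_stroke).astype(int)

X = np.column_stack([hypertension, unrelated])
ranking = rank_features_by_ami(X, stroke, ["hypertension", "unrelated"])

for name, score in ranking:
    print(f"{name}: AMI = {score:.3f}")
```

AMI corrects mutual information for chance agreement, so a feature unrelated to the outcome scores close to zero regardless of how many categories it has. Continuous variables such as age would need to be binned before this comparison.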

Results: Key factors significantly associated with higher stroke odds included older age, diabetes, hypertension, high cholesterol, and a history of myocardial infarction or angina. The random forest model achieved the best performance, with an accuracy of 72.46%, an AUC ROC of 0.72, and an F1 score of 0.74. Cross-validation confirmed its reliability. The top features were hypertension, myocardial infarction history, angina, age, diabetes status, and cholesterol.
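
The three reported metrics and the cross-validation check can be computed as in the following sketch. It assumes scikit-learn and uses synthetic data with hypothetical binary risk flags; the hyperparameters, split ratio, and fold count are illustrative choices, not values taken from the study, and the numbers it produces are unrelated to the reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (an assumption): four binary risk flags,
# of which only the first two influence the outcome.
rng = np.random.default_rng(42)
n = 4000
X = rng.integers(0, 2, size=(n, 4))
logit = -1.0 + 1.2 * X[:, 0] + 0.8 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# Hold out a stratified test set so class proportions match in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_test)              # hard labels for accuracy and F1
prob = model.predict_proba(X_test)[:, 1]  # positive-class scores for AUC

metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "auc_roc": roc_auc_score(y_test, prob),
    "f1": f1_score(y_test, pred),
}

# 5-fold cross-validated AUC on the full data as a stability check.
cv_auc = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0),
    X, y, cv=5, scoring="roc_auc",
)

print(metrics)
print("CV AUC mean:", cv_auc.mean(), "std:", cv_auc.std())
```

Note that AUC is computed from predicted probabilities, whereas accuracy and F1 depend on the decision threshold applied to them (0.5 by default here).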

Conclusion: The random forest model robustly predicted stroke risk using demographic and clinical variables. Feature importance highlighted priorities like hypertension and diabetes for clinical monitoring and intervention. This could help enable data-driven stroke prevention strategies.

Source journal: Neurosciences (Medicine, Clinical Neurology)
CiteScore: 1.40
Self-citation rate: 0.00%
Articles published: 54
Review time: 4.5 months
About the journal: Neurosciences is an open access, peer-reviewed, quarterly publication. Authors are invited to submit for publication articles reporting original work related to the nervous system, e.g., neurology, neurophysiology, neuroradiology, neurosurgery, neurorehabilitation, neurooncology, neuropsychiatry, and neurogenetics. Basic research with clear clinical implications will also be considered. Review articles of current interest and high standard are welcomed for consideration. Prospective work should not be backdated. There are also sections for Case Reports, Brief Communication, Correspondence, and medical news items. To promote continuous education, training, and learning, we include Clinical Images and MCQs. Highlights of international and regional meetings of interest, and specialized supplements will also be considered. All submissions must conform to the Uniform Requirements.