Data-driven explainable machine learning for personalized risk classification of myasthenic crisis
Sivan Bershan, Andreas Meisel, Philipp Mergenthaler
International Journal of Medical Informatics, volume 194, Article 105679. Published 2024-11-12. DOI: 10.1016/j.ijmedinf.2024.105679
https://www.sciencedirect.com/science/article/pii/S1386505624003423
Abstract
Objective
Myasthenic crisis (MC) is a critical progression of myasthenia gravis (MG) that requires intensive care treatment and invasive therapies. Classifying patients at high risk for MC facilitates treatment decisions, such as changes in medication or the need for mechanical ventilation, and helps prevent disease progression by decreasing treatment-induced stress on the patient. Here, we investigated whether MG patients can be reliably classified into groups at low or high risk of MC based entirely on routine medical data using explainable machine learning (ML).
Methods
In this single-center pseudo-prospective cohort study, we investigated the precision of ML models trained on real-world routine clinical data to identify MG patients at risk for MC, and identified explainable distinctive features for the groups. Data from 51 MG patients, including 13 with MC, were used for model training, drawing on real-world clinical data available from the hospital management system. Patients were classified as high or low risk for MC using Lasso regression or random forest ML models.
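With only 13 crisis cases among 51 patients, the way the cohort is split for cross-validation matters: a purely random split can leave a fold with almost no MC patients. The abstract does not state which splitting scheme was used, so the following is only a minimal sketch of one common option, a stratified k-fold assignment. The function name `stratified_folds`, the choice of 5 folds, and the seed are assumptions made for illustration.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, balancing classes across folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the indices of this class round-robin so every fold
        # receives a similar share of the class.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

# Cohort sizes taken from the abstract: 51 MG patients, 13 of them with MC (label 1).
labels = [1] * 13 + [0] * 38
folds = stratified_folds(labels, k=5)  # k=5 is an assumed value, not from the study

for n, test_idx in enumerate(folds):
    n_mc = sum(labels[i] for i in test_idx)
    print(f"fold {n}: {len(test_idx)} patients, {n_mc} with MC")
```

Each fold serves once as the held-out test set while the remaining folds are used for training, so every fold keeps at least two MC patients to evaluate against.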
Results
The mean cross-validated AUC for classifying MG patients as high or low risk for MC, based on simple or compound features derived from real-world clinical data, was 68.8% for a regularized Lasso regression and 76.5% for a random forest model. Studying feature importance across 5100 model runs identified explainable features that distinguish MG patients at high or low risk for MC. Feature importance scores suggested that multimorbidity may play a role in risk classification.
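For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen high-risk patient receives a higher model score than a randomly chosen low-risk patient. The sketch below computes it by direct pairwise comparison and averages it over folds. The scores and labels are hypothetical values chosen for illustration, not data from the study, and the helper name `auc` is an assumption.

```python
from statistics import mean

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical risk scores and true labels for two held-out folds.
fold_results = [
    ([0.9, 0.4, 0.35, 0.2, 0.1], [1, 1, 0, 0, 0]),
    ([0.8, 0.3, 0.6, 0.5, 0.2], [1, 1, 0, 0, 0]),
]

fold_aucs = [auc(scores, labels) for scores, labels in fold_results]
mean_auc = mean(fold_aucs)
print(f"per-fold AUC: {fold_aucs}, mean cross-validated AUC: {mean_auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported 68.8% and 76.5%.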
Conclusion
This study establishes feasibility and proof of concept for risk classification of MC based on real-world routine clinical data using ML with explainable features and variance control at the point of care. Future research on ML-based prediction of MC should include multi-center, multinational data collection, more in-depth data per patient, more patients, and an attention-based ML model to incorporate free text.
About the journal:
International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings.
The scope of the journal covers:
Information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.;
Computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.;
Educational computer-based programs pertaining to medical informatics or medicine in general;
Organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.