Multi-metric comparison of machine learning imputation methods with application to breast cancer survival.

IF 3.9 3区 医学 Q1 HEALTH CARE SCIENCES & SERVICES BMC Medical Research Methodology Pub Date : 2024-08-30 DOI:10.1186/s12874-024-02305-3
Imad El Badisy, Nathalie Graffeo, Mohamed Khalis, Roch Giorgi
{"title":"Multi-metric comparison of machine learning imputation methods with application to breast cancer survival.","authors":"Imad El Badisy, Nathalie Graffeo, Mohamed Khalis, Roch Giorgi","doi":"10.1186/s12874-024-02305-3","DOIUrl":null,"url":null,"abstract":"<p><p>Handling missing data in clinical prognostic studies is an essential yet challenging task. This study aimed to provide a comprehensive assessment of the effectiveness and reliability of different machine learning (ML) imputation methods across various analytical perspectives. Specifically, it focused on three distinct classes of performance metrics used to evaluate ML imputation methods: post-imputation bias of regression estimates, post-imputation predictive accuracy, and substantive model-free metrics. As an illustration, we applied data from a real-world breast cancer survival study. This comprehensive approach aimed to provide a thorough assessment of the effectiveness and reliability of ML imputation methods across various analytical perspectives. A simulated dataset with 30% Missing At Random (MAR) values was used. A number of single imputation (SI) methods - specifically KNN, missMDA, CART, missForest, missRanger, missCforest - and multiple imputation (MI) methods - specifically miceCART and miceRF - were evaluated. The performance metrics used were Gower's distance, estimation bias, empirical standard error, coverage rate, length of confidence interval, predictive accuracy, proportion of falsely classified (PFC), normalized root mean squared error (NRMSE), AUC, and C-index scores. The analysis revealed that in terms of Gower's distance, CART and missForest were the most accurate, while missMDA and CART excelled for binary covariates; missForest and miceCART were superior for continuous covariates. When assessing bias and accuracy in regression estimates, miceCART and miceRF exhibited the least bias. Overall, the various imputation methods demonstrated greater efficiency than complete-case analysis (CCA), with MICE methods providing optimal confidence interval coverage. In terms of predictive accuracy for Cox models, missMDA and missForest had superior AUC and C-index scores. Despite offering better predictive accuracy, the study found that SI methods introduced more bias into the regression coefficients compared to MI methods. This study underlines the importance of selecting appropriate imputation methods based on study goals and data types in time-to-event research. The varying effectiveness of methods across the different performance metrics studied highlights the value of using advanced machine learning algorithms within a multiple imputation framework to enhance research integrity and the robustness of findings.</p>","PeriodicalId":9114,"journal":{"name":"BMC Medical Research Methodology","volume":"24 1","pages":"191"},"PeriodicalIF":3.9000,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11363416/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Medical Research Methodology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1186/s12874-024-02305-3","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"HEALTH CARE SCIENCES & SERVICES","Score":null,"Total":0}
引用次数: 0

Abstract

Handling missing data in clinical prognostic studies is an essential yet challenging task. This study aimed to provide a comprehensive assessment of the effectiveness and reliability of different machine learning (ML) imputation methods across various analytical perspectives. Specifically, it focused on three distinct classes of performance metrics used to evaluate ML imputation methods: post-imputation bias of regression estimates, post-imputation predictive accuracy, and substantive model-free metrics. As an illustration, we applied data from a real-world breast cancer survival study. This comprehensive approach aimed to provide a thorough assessment of the effectiveness and reliability of ML imputation methods across various analytical perspectives. A simulated dataset with 30% Missing At Random (MAR) values was used. A number of single imputation (SI) methods - specifically KNN, missMDA, CART, missForest, missRanger, missCforest - and multiple imputation (MI) methods - specifically miceCART and miceRF - were evaluated. The performance metrics used were Gower's distance, estimation bias, empirical standard error, coverage rate, length of confidence interval, predictive accuracy, proportion of falsely classified (PFC), normalized root mean squared error (NRMSE), AUC, and C-index scores. The analysis revealed that in terms of Gower's distance, CART and missForest were the most accurate, while missMDA and CART excelled for binary covariates; missForest and miceCART were superior for continuous covariates. When assessing bias and accuracy in regression estimates, miceCART and miceRF exhibited the least bias. Overall, the various imputation methods demonstrated greater efficiency than complete-case analysis (CCA), with MICE methods providing optimal confidence interval coverage. In terms of predictive accuracy for Cox models, missMDA and missForest had superior AUC and C-index scores. Despite offering better predictive accuracy, the study found that SI methods introduced more bias into the regression coefficients compared to MI methods. This study underlines the importance of selecting appropriate imputation methods based on study goals and data types in time-to-event research. The varying effectiveness of methods across the different performance metrics studied highlights the value of using advanced machine learning algorithms within a multiple imputation framework to enhance research integrity and the robustness of findings.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
应用于乳腺癌存活率的机器学习估算方法的多指标比较。
处理临床预后研究中的缺失数据是一项至关重要但又极具挑战性的任务。本研究旨在从各种分析角度全面评估不同机器学习(ML)归因方法的有效性和可靠性。具体来说,研究重点关注用于评估机器学习归因方法的三类不同的性能指标:回归估计的输入后偏差、输入后预测准确性和实质性无模型指标。作为示例,我们应用了一项真实世界乳腺癌生存研究的数据。这种综合方法旨在从各种分析角度全面评估 ML 估算方法的有效性和可靠性。我们使用了一个包含 30% 随机缺失 (MAR) 值的模拟数据集。评估了一些单项归因(SI)方法(特别是 KNN、missMDA、CART、missForest、missRanger、missCforest)和多项归因(MI)方法(特别是 miceCART 和 miceRF)。使用的性能指标包括高尔距离、估计偏差、经验标准误差、覆盖率、置信区间长度、预测准确率、误分类比例(PFC)、归一化均方根误差(NRMSE)、AUC 和 C 指数得分。分析表明,就高尔距离而言,CART 和 missForest 的准确度最高,而 missMDA 和 CART 在二元协变量方面表现出色;对于连续协变量,missForest 和 miceCART 则更胜一筹。在评估回归估计的偏差和准确性时,miceCART 和 miceRF 的偏差最小。总体而言,各种估算方法的效率都高于全病例分析(CCA),其中 MICE 方法的置信区间覆盖率最佳。就 Cox 模型的预测准确性而言,missMDA 和 missForest 的 AUC 和 C-index 得分更高。研究发现,尽管 SI 方法具有更好的预测准确性,但与 MI 方法相比,SI 方法在回归系数中引入了更多偏差。这项研究强调了在时间到事件研究中根据研究目标和数据类型选择适当归因方法的重要性。在所研究的不同性能指标中,各种方法的有效性各不相同,这凸显了在多重归因框架内使用先进的机器学习算法来提高研究完整性和研究结果稳健性的价值。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
BMC Medical Research Methodology
BMC Medical Research Methodology 医学-卫生保健
CiteScore
6.50
自引率
2.50%
发文量
298
审稿时长
3-8 weeks
期刊介绍: BMC Medical Research Methodology is an open access journal publishing original peer-reviewed research articles in methodological approaches to healthcare research. Articles on the methodology of epidemiological research, clinical trials and meta-analysis/systematic review are particularly encouraged, as are empirical studies of the associations between choice of methodology and study outcomes. BMC Medical Research Methodology does not aim to publish articles describing scientific methods or techniques: these should be directed to the BMC journal covering the relevant biomedical subject area.
期刊最新文献
The role of the estimand framework in the analysis of patient-reported outcomes in single-arm trials: a case study in oncology. Cardinality matching versus propensity score matching for addressing cluster-level residual confounding in implantable medical device and surgical epidemiology: a parametric and plasmode simulation study. Establishing a machine learning dementia progression prediction model with multiple integrated data. Correction: Forced randomization: the what, why, and how. Three new methodologies for calculating the effective sample size when performing population adjustment.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1