Multi-metric comparison of machine learning imputation methods with application to breast cancer survival.

BMC Medical Research Methodology (IF 3.9; Q1, Health Care Sciences & Services; Zone 3, Medicine). Pub Date: 2024-08-30. DOI: 10.1186/s12874-024-02305-3

Imad El Badisy, Nathalie Graffeo, Mohamed Khalis, Roch Giorgi

BMC Medical Research Methodology 24(1):191. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11363416/pdf/


Abstract

Handling missing data in clinical prognostic studies is an essential yet challenging task. This study aimed to provide a comprehensive assessment of the effectiveness and reliability of different machine learning (ML) imputation methods across various analytical perspectives. Specifically, it focused on three distinct classes of performance metrics used to evaluate ML imputation methods: post-imputation bias of regression estimates, post-imputation predictive accuracy, and substantive model-free metrics. As an illustration, the methods were applied to data from a real-world breast cancer survival study.

A simulated dataset with 30% Missing At Random (MAR) values was used. Six single imputation (SI) methods (KNN, missMDA, CART, missForest, missRanger, and missCforest) and two multiple imputation (MI) methods (miceCART and miceRF) were evaluated. The performance metrics were Gower's distance, estimation bias, empirical standard error, coverage rate, length of confidence interval, predictive accuracy, proportion of falsely classified entries (PFC), normalized root mean squared error (NRMSE), AUC, and C-index.

In terms of Gower's distance, CART and missForest were the most accurate. missMDA and CART excelled for binary covariates, while missForest and miceCART were superior for continuous covariates. For regression estimates, miceCART and miceRF exhibited the least bias. Overall, the imputation methods were more efficient than complete-case analysis (CCA), with the MICE methods providing the best confidence interval coverage. In terms of predictive accuracy for Cox models, missMDA and missForest had superior AUC and C-index scores. Although SI methods offered better predictive accuracy, they introduced more bias into the regression coefficients than MI methods.

This study underlines the importance of selecting appropriate imputation methods based on study goals and data types in time-to-event research. The varying effectiveness of methods across the different performance metrics studied highlights the value of using advanced machine learning algorithms within a multiple imputation framework to enhance research integrity and the robustness of findings.
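Two of the model-free metrics named in the abstract, NRMSE for continuous covariates and PFC for categorical covariates, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses one common definition (squared error normalized by the variance of the true values, computed only over entries that were masked and then imputed), the population variance is an arbitrary choice here, and the function names and toy values are hypothetical.

```python
import math


def nrmse(true_vals, imputed_vals):
    """Normalized RMSE over entries that were masked and then imputed.

    Assumption: normalization by the population variance of the true values.
    """
    n = len(true_vals)
    mean_true = sum(true_vals) / n
    mse = sum((t - i) ** 2 for t, i in zip(true_vals, imputed_vals)) / n
    var_true = sum((t - mean_true) ** 2 for t in true_vals) / n
    return math.sqrt(mse / var_true)


def pfc(true_labels, imputed_labels):
    """Proportion of falsely classified entries among the imputed ones."""
    wrong = sum(t != i for t, i in zip(true_labels, imputed_labels))
    return wrong / len(true_labels)


# Hypothetical toy data: true values of masked cells and their imputations.
true_age = [50.0, 60.0, 70.0, 80.0]
imputed_age = [52.0, 58.0, 73.0, 79.0]
true_stage = ["I", "II", "II", "III"]
imputed_stage = ["I", "II", "III", "III"]

print("NRMSE:", nrmse(true_age, imputed_age))
print("PFC:", pfc(true_stage, imputed_stage))
```

Both metrics are 0 for a perfect imputation and grow as imputed values drift from the truth, which is why they can be computed only in simulations where the masked values are known.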




Source journal

BMC Medical Research Methodology (Medicine: Health Care)

CiteScore: 6.50
Self-citation rate: 2.50%
Articles published: 298
Review time: 3-8 weeks

About the journal: BMC Medical Research Methodology is an open access journal publishing original peer-reviewed research articles in methodological approaches to healthcare research. Articles on the methodology of epidemiological research, clinical trials and meta-analysis/systematic review are particularly encouraged, as are empirical studies of the associations between choice of methodology and study outcomes. BMC Medical Research Methodology does not aim to publish articles describing scientific methods or techniques: these should be directed to the BMC journal covering the relevant biomedical subject area.