Title: Machine learning models including patient-reported outcome data in oncology: a systematic literature review and analysis of their reporting quality
Authors: Daniela Krepper, Matteo Cesari, Niclas J Hubel, Philipp Zelger, Monika J Sztankay
Journal: Journal of Patient-Reported Outcomes (JCR Q2, Health Care Sciences & Services; impact factor 2.4)
DOI: 10.1186/s41687-024-00808-7
Published: 2024-11-05 (Journal Article)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11538124/pdf/
Citations: 0
Abstract
Purpose: To critically examine the current state of machine learning (ML) models including patient-reported outcome measure (PROM) scores in cancer research, by investigating the reporting quality of currently available studies and proposing areas of improvement for future use of ML in the field.
Methods: PubMed and Web of Science were systematically searched for publications of studies on patients with cancer applying ML models with PROM scores as either predictors or outcomes. The reporting quality of applied ML models was assessed utilizing an adapted version of the MI-CLAIM (Minimum Information about CLinical Artificial Intelligence Modelling) checklist. The key variables of the checklist are study design, data preparation, model development, optimization, performance, and examination. Reproducibility and transparency complement the reporting quality criteria.
Results: The literature search yielded 1634 hits, of which 52 (3.2%) were eligible. Thirty-six (69.2%) publications included PROM scores as a predictor and 32 (61.5%) as an outcome. The reporting quality appraisal indicates potential for improvement, especially in the area of model examination. Measured against the standards of the MI-CLAIM checklist, the reporting quality of the ML models in the included studies proved to be low. Only nine (17.3%) publications discuss the clinical applicability and reproducibility of the developed model, and only three (5.8%) provide code to reproduce the model and its results.
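The proportions in the Results paragraph are simple shares of either the total search yield (1634 hits) or the eligible studies (52). A minimal sketch recomputing them from the reported counts (all figures taken directly from the abstract; note that the predictor and outcome categories overlap, since some studies used PROM scores as both):

```python
# Recompute the percentages reported in the Results section.
# Counts come from the abstract; denominators are either the
# total search yield (1634) or the eligible studies (52).
total_hits = 1634
eligible = 52

counts = {
    "eligible of all hits": (eligible, total_hits),
    "PROM scores as predictor": (36, eligible),
    "PROM scores as outcome": (32, eligible),
    "discuss applicability/reproducibility": (9, eligible),
    "provide reproducible code": (3, eligible),
}

for label, (n, denom) in counts.items():
    pct = round(100 * n / denom, 1)  # one decimal, as in the abstract
    print(f"{label}: {n}/{denom} = {pct}%")
```

Running this reproduces the published figures (3.2%, 69.2%, 61.5%, 17.3%, 5.8%), confirming the abstract's arithmetic is internally consistent.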
Conclusion: This critical examination of the status quo of ML models including PROM scores in published oncological studies identified areas of improvement for reporting and for the future use of ML in the field.