Evaluation of machine learning methods for the retrospective detection of ovarian cancer recurrences from chemotherapy data

A.D. Coles, C.D. McInerney, K. Zucker, S. Cheeseman, O.A. Johnson, G. Hall
ESMO Real World Data and Digital Oncology, volume 4, Article 100038. Published 2024-05-01.
DOI: 10.1016/j.esmorw.2024.100038
Full text: https://www.sciencedirect.com/science/article/pii/S294982012400016X

Abstract

Background

Cancer recurrences are poorly recorded within electronic health records around the world. This hinders research into the efficacy of cancer treatments. Currently, recurrence/progression diagnosis dates are identified retrospectively by staff who manually review patients’ health records. This is expensive, time-consuming, and inefficient. Machine learning models may expedite the review of health records and facilitate the assessment of alternative cancer therapies.

Materials and methods

This paper evaluates the use of four machine learning models (random forests, conditional inference trees, decision trees, and logistic regression) in identifying proxy dates of epithelial ovarian cancer recurrence/progression from chemotherapy data, in 531 patients at Leeds Teaching Hospital Trust.
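
The abstract does not say how candidate events are derived from the chemotherapy data. As a minimal sketch, assuming each patient has a list of chemotherapy administration dates, one simple heuristic is to treat the first administration after a long treatment-free gap as a candidate proxy date for recurrence/progression. The function name `candidate_proxy_dates` and the 90-day `gap_days` threshold are hypothetical and are not taken from the paper.

```python
from datetime import date, timedelta


def candidate_proxy_dates(admin_dates, gap_days=90):
    """Return dates that start a new block of treatment after a long gap.

    admin_dates: iterable of datetime.date chemotherapy administrations.
    gap_days: hypothetical threshold; not a value from the paper.
    """
    ordered = sorted(admin_dates)
    threshold = timedelta(days=gap_days)
    candidates = []
    # Compare each administration with the one immediately before it.
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > threshold:
            candidates.append(current)
    return candidates


# Illustrative, made-up dates for a single patient.
example_dates = [
    date(2015, 1, 5), date(2015, 1, 26), date(2015, 2, 16),
    date(2016, 3, 1), date(2016, 3, 22),
]
example_candidates = candidate_proxy_dates(example_dates)
```

A classifier such as those listed above could then be trained to separate candidates that reflect recurrence/progression from those that do not (for example, planned treatment breaks), using chart-reviewed dates as labels.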

Results

The random forest achieved the highest F1 score, 0.941 (95% confidence interval 0.916-0.968), when identifying recurrence events. The classifications of both the random forest and the decision tree closely conform to chart-reviewed time to next treatment, which serves as a surrogate for recurrence-free survival. Additionally, all models reached an F1 score >0.940 when identifying patients whose cancer recurred/progressed.
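
For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall, F1 = 2TP / (2TP + FP + FN). The abstract does not state how the 95% confidence interval was obtained; the sketch below assumes a percentile bootstrap over patients' events, which is one common choice and not necessarily the authors' method. The labels are made up for illustration, and the function names are hypothetical.

```python
import random


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels (1 = recurrence)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def bootstrap_f1_interval(y_true, y_pred, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for F1 (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample cases with replacement and rescore.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up labels purely for illustration; not data from the study.
y_true = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1, 1, 0]
point_estimate = f1_score(y_true, y_pred)
interval = bootstrap_f1_interval(y_true, y_pred)
```

With only ten cases the interval will be wide; on a cohort of several hundred patients, as in the study, it would be expected to narrow considerably.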

Conclusions

Our models proficiently identify both proxy dates for recurrence/progression diagnoses and patients whose cancer recurred/progressed. Considering the similar performance of the random forest and decision tree, model preference should be determined by the interpretability required to assist chart review and the ease of implementation into existing architecture.
