重采样数据解决类别不平衡问题(IRCIP)的意义:医疗数据分类算法对性能影响的评估

IF 2.5 Q2 HEALTH CARE SCIENCES & SERVICES JAMIA Open Pub Date : 2023-07-01 DOI:10.1093/jamiaopen/ooad033
Koen Welvaars, Jacobien H F Oosterhoff, Michel P J van den Bekerom, Job N Doornberg, Ernst P van Haarst
{"title":"重采样数据解决类别不平衡问题(IRCIP)的意义:医疗数据分类算法对性能影响的评估","authors":"Koen Welvaars,&nbsp;Jacobien H F Oosterhoff,&nbsp;Michel P J van den Bekerom,&nbsp;Job N Doornberg,&nbsp;Ernst P van Haarst","doi":"10.1093/jamiaopen/ooad033","DOIUrl":null,"url":null,"abstract":"<p><strong>Objective: </strong>When correcting for the \"class imbalance\" problem in medical data, the effects of resampling applied on classifier algorithms remain unclear. We examined the effect on performance over several combinations of classifiers and resampling ratios.</p><p><strong>Materials and methods: </strong>Multiple classification algorithms were trained on 7 resampled datasets: no correction, random undersampling, 4 ratios of Synthetic Minority Oversampling Technique (SMOTE), and random oversampling with the Adaptive Synthetic algorithm (ADASYN). Performance was evaluated in Area Under the Curve (AUC), precision, recall, Brier score, and calibration metrics. A case study on prediction modeling for 30-day unplanned readmissions in previously admitted Urology patients was presented.</p><p><strong>Results: </strong>For most algorithms, using resampled data showed a significant increase in AUC and precision, ranging from 0.74 (CI: 0.69-0.79) to 0.93 (CI: 0.92-0.94), and 0.35 (CI: 0.12-0.58) to 0.86 (CI: 0.81-0.92) respectively. All classification algorithms showed significant increases in recall, and significant decreases in Brier score with distorted calibration overestimating positives.</p><p><strong>Discussion: </strong>Imbalance correction resulted in an overall improved performance, yet poorly calibrated models. There can still be clinical utility due to a strong discriminating performance, specifically when predicting only low and high risk cases is clinically more relevant.</p><p><strong>Conclusion: </strong>Resampling data resulted in increased performances in classification algorithms, yet produced an overestimation of positive predictions. Based on the findings from our case study, a thoughtful predefinition of the clinical prediction task may guide the use of resampling techniques in future studies aiming to improve clinical decision support tools.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":null,"pages":null},"PeriodicalIF":2.5000,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/bb/be/ooad033.PMC10232287.pdf","citationCount":"1","resultStr":"{\"title\":\"Implications of resampling data to address the class imbalance problem (IRCIP): an evaluation of impact on performance between classification algorithms in medical data.\",\"authors\":\"Koen Welvaars,&nbsp;Jacobien H F Oosterhoff,&nbsp;Michel P J van den Bekerom,&nbsp;Job N Doornberg,&nbsp;Ernst P van Haarst\",\"doi\":\"10.1093/jamiaopen/ooad033\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Objective: </strong>When correcting for the \\\"class imbalance\\\" problem in medical data, the effects of resampling applied on classifier algorithms remain unclear. We examined the effect on performance over several combinations of classifiers and resampling ratios.</p><p><strong>Materials and methods: </strong>Multiple classification algorithms were trained on 7 resampled datasets: no correction, random undersampling, 4 ratios of Synthetic Minority Oversampling Technique (SMOTE), and random oversampling with the Adaptive Synthetic algorithm (ADASYN). Performance was evaluated in Area Under the Curve (AUC), precision, recall, Brier score, and calibration metrics. A case study on prediction modeling for 30-day unplanned readmissions in previously admitted Urology patients was presented.</p><p><strong>Results: </strong>For most algorithms, using resampled data showed a significant increase in AUC and precision, ranging from 0.74 (CI: 0.69-0.79) to 0.93 (CI: 0.92-0.94), and 0.35 (CI: 0.12-0.58) to 0.86 (CI: 0.81-0.92) respectively. All classification algorithms showed significant increases in recall, and significant decreases in Brier score with distorted calibration overestimating positives.</p><p><strong>Discussion: </strong>Imbalance correction resulted in an overall improved performance, yet poorly calibrated models. There can still be clinical utility due to a strong discriminating performance, specifically when predicting only low and high risk cases is clinically more relevant.</p><p><strong>Conclusion: </strong>Resampling data resulted in increased performances in classification algorithms, yet produced an overestimation of positive predictions. Based on the findings from our case study, a thoughtful predefinition of the clinical prediction task may guide the use of resampling techniques in future studies aiming to improve clinical decision support tools.</p>\",\"PeriodicalId\":36278,\"journal\":{\"name\":\"JAMIA Open\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.5000,\"publicationDate\":\"2023-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/bb/be/ooad033.PMC10232287.pdf\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"JAMIA Open\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/jamiaopen/ooad033\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"HEALTH CARE SCIENCES & SERVICES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"JAMIA Open","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/jamiaopen/ooad033","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"HEALTH CARE SCIENCES & SERVICES","Score":null,"Total":0}
引用次数: 1

摘要

目的:在纠正医疗数据中的“类失衡”问题时,重采样对分类器算法的影响尚不清楚。我们检查了分类器和重采样比率的几种组合对性能的影响。材料和方法:在7个重采样数据集上训练多种分类算法:不校正、随机欠采样、4比合成少数过采样技术(SMOTE)和随机过采样自适应合成算法(ADASYN)。通过曲线下面积(AUC)、精密度、召回率、Brier评分和校准指标来评估其性能。本文介绍了一个泌尿外科患者30天非计划再入院预测模型的案例研究。结果:对于大多数算法,使用重采样数据显示AUC和精度显著增加,分别在0.74 (CI: 0.69-0.79)至0.93 (CI: 0.92-0.94)和0.35 (CI: 0.12-0.58)至0.86 (CI: 0.81-0.92)之间。所有分类算法均显示召回率显著提高,而失真校准高估阳性的Brier评分显著降低。讨论:不平衡校正导致了整体性能的提高,但是模型校准不好。由于具有很强的区分性能,特别是当仅预测低风险和高风险病例在临床上更相关时,仍然可能具有临床效用。结论:重采样数据提高了分类算法的性能,但产生了对积极预测的高估。基于我们案例研究的发现,对临床预测任务进行深思熟虑的预先定义可能会指导在未来的研究中使用重采样技术,以改进临床决策支持工具。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

摘要图片

摘要图片

摘要图片

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Implications of resampling data to address the class imbalance problem (IRCIP): an evaluation of impact on performance between classification algorithms in medical data.

Objective: When correcting for the "class imbalance" problem in medical data, the effects of resampling applied on classifier algorithms remain unclear. We examined the effect on performance over several combinations of classifiers and resampling ratios.

Materials and methods: Multiple classification algorithms were trained on 7 resampled datasets: no correction, random undersampling, 4 ratios of Synthetic Minority Oversampling Technique (SMOTE), and random oversampling with the Adaptive Synthetic algorithm (ADASYN). Performance was evaluated in Area Under the Curve (AUC), precision, recall, Brier score, and calibration metrics. A case study on prediction modeling for 30-day unplanned readmissions in previously admitted Urology patients was presented.

Results: For most algorithms, using resampled data showed a significant increase in AUC and precision, ranging from 0.74 (CI: 0.69-0.79) to 0.93 (CI: 0.92-0.94), and 0.35 (CI: 0.12-0.58) to 0.86 (CI: 0.81-0.92) respectively. All classification algorithms showed significant increases in recall, and significant decreases in Brier score with distorted calibration overestimating positives.

Discussion: Imbalance correction resulted in an overall improved performance, yet poorly calibrated models. There can still be clinical utility due to a strong discriminating performance, specifically when predicting only low and high risk cases is clinically more relevant.

Conclusion: Resampling data resulted in increased performances in classification algorithms, yet produced an overestimation of positive predictions. Based on the findings from our case study, a thoughtful predefinition of the clinical prediction task may guide the use of resampling techniques in future studies aiming to improve clinical decision support tools.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
JAMIA Open
JAMIA Open Medicine-Health Informatics
CiteScore
4.10
自引率
4.80%
发文量
102
审稿时长
16 weeks
期刊最新文献
A landmark federal interagency collaboration to promote data science in health care: Million Veteran Program-Computational Health Analytics for Medical Precision to Improve Outcomes Now. Targetable molecular algorithm and training platform development for the treatment of non-small cell lung cancer. Sex, sexual orientation, and gender identity data collection across electronic health record platforms: a national cross-sectional survey. Assessing the use of unstructured electronic health record data to identify exposure to firearm violence. Developing personas to inform the design of digital interventions for perinatal mental health.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1