Jicheng Li, Tao Zhu, Lin Wang, Luxi Yang, Yulong Zhu, Rui Li, Yubo Li, Yongcong Chen, Lingqing Zhang
Journal: BMC Medical Informatics and Decision Making, 24(1):280 (impact factor 3.3; JCR Q2, Medical Informatics). Published 2024-09-30. DOI: 10.1186/s12911-024-02674-1 (https://doi.org/10.1186/s12911-024-02674-1). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11443660/pdf/
Citations: 0
Abstract
Study on medical dispute prediction model and its clinical-application effectiveness based on machine learning.
Background: Medical disputes are a global public health issue that is attracting increasing attention. In this study, we used machine learning (ML) methods to build a dispute prediction model and explored the clinical-application effectiveness of this model in reducing the occurrence of medical disputes.
Methods: This retrospective study included all disputes filed by the Gansu Medical Mediation Committee from 2019 to 2021. Patients from hospitals of the same level and with the same hospitalization year as the dispute group were randomly selected as the control group in a 1:1 ratio. SPSS software was used for univariate feature selection among the 14 factors that may cause disputes, and factors showing statistically significant differences were retained. The data were divided into training and test sets in a 7:3 ratio. Six ML models were selected, and Python was used to build the dispute prediction models. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve, sensitivity, specificity, accuracy, precision, average precision (AP), and F1 score were used to characterize the fit and accuracy of the models, while decision curve analysis (DCA) was used to evaluate their clinical utility.
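The threshold-based metrics and the DCA net benefit named above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions made for the example. Net benefit uses the standard DCA definition, TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tp, fp, tn, fn), predicting 'dispute' (1) when prob >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def _ratio(num, den):
    """Safe division: returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity, accuracy, precision, and F1 at one threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


def net_benefit(y_true, y_prob, threshold):
    """DCA net benefit at a threshold probability pt with 0 < pt < 1."""
    tp, fp, _, _ = confusion_counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy data: 1 = dispute, 0 = control (not taken from the study).
Y_TRUE = [1, 1, 1, 1, 0, 0, 0, 0]
Y_PROB = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]

if __name__ == "__main__":
    print(classification_metrics(Y_TRUE, Y_PROB))
    print(net_benefit(Y_TRUE, Y_PROB, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of threshold probabilities and comparing the result with the "treat all" and "treat none" strategies.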
Results: A total of 1189 patients in the dispute and control groups were extracted. The following 11 influencing factors were selected: inpatient department, doctor title, patient age, patient gender, patient occupation, payment method, hospitalization days, hospitalization times, discharge method, blood transfusion volume, and hospitalization expenses. The random forest model achieved a higher AUC (0.945, 95% CI 0.913-0.981), sensitivity (0.887), accuracy (0.887), AP (0.834), and F1 score (0.880) than the other models, and its DCA curve indicated high clinical benefit.
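An AUC with a 95% confidence interval, as reported above, can be computed along the following lines. The abstract does not say how the interval was obtained, so the percentile bootstrap used here is an assumption, as are the function names, the number of resamples, and the toy data. The AUC itself is computed from its rank interpretation: the probability that a randomly chosen dispute case scores higher than a randomly chosen control, with ties counted as one half.

```python
import random


def auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, labels must be 0/1)."""
    auc(y_true, y_score)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lo_idx = max(0, int(alpha / 2 * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return stats[lo_idx], stats[hi_idx]


# Hypothetical toy data: 1 = dispute, 0 = control (not taken from the study).
Y_TRUE = [1, 1, 1, 1, 0, 0, 0, 0]
Y_SCORE = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]

if __name__ == "__main__":
    print(auc(Y_TRUE, Y_SCORE))
    print(bootstrap_auc_ci(Y_TRUE, Y_SCORE))
```

With only eight toy cases the interval is very wide; on a test set of a few hundred patients it would narrow considerably.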
Conclusions: Inpatient department, hospitalization expenses, and discharge type are the primary factors influencing disputes. The random forest model showed high dispute-prediction and clinical-application value and is expected to be promoted for offline dispute prediction.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.