Construction and evaluation of a liver cancer risk prediction model based on machine learning.

IF 2.5 | Zone 4, Medicine | JCR Q2, GASTROENTEROLOGY & HEPATOLOGY | World Journal of Gastrointestinal Oncology | Pub Date: 2024-09-15 | DOI: 10.4251/wjgo.v16.i9.3839

Ying-Ying Wang, Wan-Xia Yang, Qia-Jun Du, Zhen-Hua Liu, Ming-Hua Lu, Chong-Ge You

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11438789/pdf/

Citations: 0

Abstract



Construction and evaluation of a liver cancer risk prediction model based on machine learning.

Background: Liver cancer is one of the most prevalent malignant tumors worldwide, and its early detection and treatment are crucial for enhancing patient survival rates and quality of life. However, the early symptoms of liver cancer are often not obvious, resulting in a late-stage diagnosis in many patients, which significantly reduces the effectiveness of treatment. Developing a highly targeted, widely applicable, and practical risk prediction model for liver cancer is crucial for enhancing the early diagnosis and long-term survival rates among affected individuals.

Aim: To develop a liver cancer risk prediction model by employing machine learning techniques, and subsequently assess its performance.

Methods: In this study, a total of 550 patients were enrolled, with 190 hepatocellular carcinoma (HCC) and 195 cirrhosis patients serving as the training cohort, and 83 HCC and 82 cirrhosis patients forming the validation cohort. Logistic regression (LR), support vector machine (SVM), random forest (RF), and least absolute shrinkage and selection operator (LASSO) regression models were developed in the training cohort. Model performance was assessed in the validation cohort. Additionally, this study conducted a comparative evaluation of the diagnostic efficacy between the ASAP model and the model developed in this study using receiver operating characteristic curve, calibration curve, and decision curve analysis (DCA) to determine the optimal predictive model for assessing liver cancer risk.
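The decision curve analysis mentioned in the Methods can be illustrated with the standard net-benefit formula, NB(pt) = TP/n - (FP/n) * pt/(1 - pt), where pt is the risk threshold above which a patient would be worked up for HCC. The following is a minimal sketch in plain Python, not the authors' code: the function names and the toy labels and risks are hypothetical and are not taken from the study.

```python
def net_benefit(labels, risks, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    labels: 1 = HCC, 0 = cirrhosis (hypothetical coding).
    risks: predicted probabilities from any model.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    # Count true and false positives at this threshold.
    tp = sum(1 for y, r in zip(labels, risks) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(labels, risks) if r >= threshold and y == 0)
    # A false positive is penalized by the odds of the threshold.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat every patient as high risk."""
    n = len(labels)
    prevalence = sum(labels) / n
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data, made up for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_risks = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

nb_model = net_benefit(toy_labels, toy_risks, 0.5)
nb_all = net_benefit_treat_all(toy_labels, 0.5)
```

A decision curve plots these quantities over a range of thresholds. A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy).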

Results: Six variables (age; white blood cell, red blood cell, and platelet counts; and alpha-fetoprotein and protein induced by vitamin K absence or antagonist II levels) were used to develop the LR, SVM, RF, and LASSO regression models. The RF model exhibited superior discrimination, with an area under the curve (AUC) of 0.969 in the training set and 0.858 in the validation set. These values significantly surpassed those of the LR (0.850 and 0.827), SVM (0.860 and 0.803), LASSO regression (0.845 and 0.831), and ASAP (0.866 and 0.813) models. Furthermore, calibration and DCA indicated that the RF model was well calibrated and showed clinical validity.
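For readers unfamiliar with the discrimination metric reported above, the area under the ROC curve equals the probability that a randomly chosen HCC patient receives a higher predicted risk than a randomly chosen cirrhosis patient, with ties counted as one half. The sketch below computes it directly from that definition in plain Python. It is an illustration only: the function name and the toy data are hypothetical, and the study's own software is not described in the abstract.

```python
def roc_auc(labels, scores):
    """AUC via the pairwise (Mann-Whitney) formulation.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = HCC, 0 = cirrhosis; scores are made-up risks.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs are ordered correctly
```

The pairwise loop is quadratic in cohort size, which is fine for cohorts of a few hundred patients such as those described here; a rank-based implementation would be preferable for much larger datasets.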

Conclusion: The RF model demonstrated excellent prediction capabilities for HCC and can facilitate early diagnosis of HCC in clinical practice.


Source journal

World Journal of Gastrointestinal Oncology (Medicine-Gastroenterology)

CiteScore: 4.20

Self-citation rate: 3.30%

Articles published: 1082

About the journal: The World Journal of Gastrointestinal Oncology (WJGO) is a leading academic journal devoted to reporting the latest, cutting-edge research progress and findings of basic research and clinical practice in the field of gastrointestinal oncology.