Construction of diagnostic models with machine-learning algorithms for colorectal cancer based on clinical laboratory parameters
Dengqing Si, Yu Shu, Hongbo Jiang, Xueping Lin, Qiurong Yuan, Shaotuan Deng, Wei Luo, Yangze Lin, Ju Wang, Chengxiong Zhan, Aasma Shaukat, Peter C Ambe, Shiqiong Niu, Zhaofan Luo
Journal of Gastrointestinal Oncology. Published 2024-10-31 (Epub 2024-09-12). DOI: 10.21037/jgo-24-516 (https://doi.org/10.21037/jgo-24-516)
Citations: 0
Abstract
Background: Colonoscopy remains the predominant diagnostic modality for colorectal cancer (CRC), as the diagnostic performance of tumor markers alone is limited, particularly in the early stages of the disease. This study sought to develop a diagnostic model for CRC that integrates various laboratory parameters.
Methods: One hundred patients with CRC were assigned to an experimental group, while 114 patients with benign colorectal diseases and 101 healthy individuals were assigned to a control group. Clinical and laboratory data were collected for each participant, including tumor markers such as carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA19-9), and carbohydrate antigen 242 (CA242), as well as blood count, blood biochemical, and coagulation parameters. Three machine-learning algorithms [multilayer perceptron (MLP), eXtreme Gradient Boosting (XGBoost), and random forest (RF)] were used to construct CRC diagnostic models. The performance of each model was evaluated based on its area under the curve (AUC), sensitivity, and specificity.
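The evaluation step described above can be illustrated with a minimal sketch. The abstract does not say how the metrics were computed, so everything here is an assumption for illustration: the function names `auc_score` and `sensitivity_specificity`, the 0.5 decision threshold, and the toy labels and scores (which are not study data). AUC is computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half.

```python
from typing import Sequence, Tuple


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical, NOT from the study): 1 = CRC, 0 = control.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

toy_auc = auc_score(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
print(f"AUC={toy_auc:.3f} sensitivity={toy_sens:.3f} specificity={toy_spec:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred participants; a rank-based formulation would be preferable for larger datasets.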
Results: Twelve parameters were selected to build the diagnostic model: CEA, CA19-9, CA242, absolute neutrophil value (NEUT), hemoglobin, the neutrophil/lymphocyte ratio, the platelet/lymphocyte ratio, alanine aminotransferase, alkaline phosphatase, aspartate aminotransferase, albumin, and prothrombin time. For the validation set, the RF machine-learning model achieved the highest performance in identifying CRC [AUC: 0.902 (95% confidence interval: 0.812-0.989), accuracy: 0.803, sensitivity: 0.908, specificity: 0.772, positive predictive value: 0.664, negative predictive value: 0.890, and F1 score: 0.763]. The AUC, sensitivity, specificity, and Youden's index for the combined diagnosis using the tumor markers CEA, CA19-9, and CA242 were 0.761, 0.486, 0.983, and 0.469, respectively. The RF diagnostic model showed better diagnostic efficacy than the combined diagnosis model based on the tumor markers CEA, CA19-9, and CA242.
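As a worked check of the figures above, Youden's index is defined as sensitivity + specificity - 1. The sketch below applies that definition to the reported values. The function name `youden_index` is illustrative, and the RF value is derived here from the reported sensitivity and specificity; it is not a figure stated in the abstract.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Combined tumor markers (CEA, CA19-9, CA242): values reported in the abstract.
tumor_marker_j = youden_index(0.486, 0.983)

# RF model: derived from the reported sensitivity and specificity
# (an inferred value, not one given by the authors).
rf_j = youden_index(0.908, 0.772)

print(f"tumor markers J = {tumor_marker_j:.3f}")  # 0.469, matching the reported value
print(f"random forest J = {rf_j:.3f}")            # 0.680
```

On these numbers the tumor-marker combination is highly specific but misses roughly half of the CRC cases, whereas the RF model gives up some specificity for a much higher sensitivity.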
Conclusions: The use of machine learning combined with multiple laboratory parameters effectively improved diagnostic efficiency for CRC and provided more accurate results for clinical diagnosis.
About the journal:
Journal of Gastrointestinal Oncology (Print ISSN 2078-6891; Online ISSN 2219-679X; J Gastrointest Oncol; JGO), the official journal of the Society for Gastrointestinal Oncology (SGO), is an open-access, international peer-reviewed journal. It was published quarterly from Sep. 2010 to Dec. 2013, has been published bimonthly since Feb. 2014, and is openly distributed worldwide.
JGO publishes manuscripts that focus on updated and practical information about diagnosis, prevention and clinical investigations of gastrointestinal cancer treatment. Specific areas of interest include, but are not limited to, multimodality therapy, markers, imaging and tumor biology.