Random forests algorithm using basic medical data for predicting the presence of colonic polyps
Authors: Mihaela-Flavia Avram, Nicolae Lupa, Dimitrios Koukoulas, Daniela-Cornelia Lazăr, Mihaela-Ioana Mariș, Marius-Sorin Murariu, Sorin Olariu
Journal: Frontiers in Surgery, volume 12, article 1523684 (eCollection)
Published: 2025-03-03
Publication type: Journal Article
DOI: https://doi.org/10.3389/fsurg.2025.1523684
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11911476/pdf/
Journal metrics: impact factor 1.6; JCR Q2 (Surgery); category Medicine, region 4
Citations: 0
Abstract
Background: Colorectal cancer is considered to be triggered by the malignant transformation of colorectal polyps. Early diagnosis and excision of colorectal polyps have been found to lower the mortality and morbidity associated with colorectal cancer.
Objective: The aim of this study is to offer a predictive model for the presence of colorectal polyps based on the Random Forests machine learning algorithm, using basic patient information and common laboratory test results.
Materials and methods: A total of 164 patients were included in the study. The following data were collected: sex, residence, age, diabetes mellitus, body mass index, fasting blood glucose level, hemoglobin, platelets, total, LDL and HDL cholesterol, triglycerides, serum glutamic-oxaloacetic transaminase, chronic gastritis, and presence of colonic polyps at colonoscopy. Of these patients, 80% were assigned to the training set used to build the Random Forests model and 20% to the test set. External validation was performed on data from 42 patients. The performance of the Random Forests model was compared with that of a generalized linear model (GLM) and a support vector machine (SVM) built and tested on the same datasets.
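The 80%/20% partition described above can be sketched as follows. This is a minimal illustration only: the abstract does not say how patients were assigned or how the fractional count was rounded, so the seeded shuffle, the `round` rule, the function name `split_train_test`, and the integer patient identifiers are all assumptions.

```python
import random

def split_train_test(patient_ids, train_fraction=0.8, seed=0):
    """Shuffle patient IDs reproducibly and split them into train and test lists."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split can be reproduced
    n_train = round(train_fraction * len(ids))  # rounding rule is an assumption
    return ids[:n_train], ids[n_train:]

# Hypothetical identifiers 0..163 standing in for the 164 study patients.
train_ids, test_ids = split_train_test(range(164))
print(len(train_ids), len(test_ids))
```

Under this rounding rule, 164 patients yield 131 for training and 33 for testing. The 42 external-validation patients would be kept entirely outside this split.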
Results: The Random Forests prediction model achieved an AUC of 0.820 on the test set. The top five variables in order of importance were body mass index, platelets, hemoglobin, triglycerides, and glutamic-oxaloacetic transaminase. On external validation, the AUC was 0.79. The GLM achieved an AUC of 0.788 on internal validation and 0.65 on external validation. The SVM achieved an AUC of 0.785 on internal validation and 0.685 on the external validation dataset.
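For readers less familiar with the metric, the AUC reported above can be read as the probability that a randomly chosen patient with polyps receives a higher model score than a randomly chosen patient without polyps. The sketch below computes it by direct pairwise comparison. The labels and scores are made up for illustration and are not study data, and the function name `auc` is hypothetical.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case scored above the negative case
            elif p == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = polyps found at colonoscopy, 0 = no polyps.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(auc(labels, scores))
```

In this toy example, 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.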
Conclusions: A Random Forests prediction model was developed using patients' demographic data, medical history and common blood test results. This algorithm can predict the presence of colonic polyps with good discriminative performance.
About the journal:
Evidence of surgical interventions goes back to prehistoric times. Since then, the field of surgery has developed into a complex array of specialties and procedures, particularly with the advent of microsurgery, lasers and minimally invasive techniques. The advanced skills now required from surgeons have led to ever increasing specialization, though these specialties still share important fundamental principles.
Frontiers in Surgery is the umbrella journal representing the publication interests of all surgical specialties. It is divided into several “Specialty Sections” listed below. All these sections have their own Specialty Chief Editor, Editorial Board and homepage, but all articles carry the citation Frontiers in Surgery.
Frontiers in Surgery calls upon medical professionals and scientists from all surgical specialties to publish their experimental and clinical studies in this journal. By assembling all surgical specialties, which nonetheless retain their independence, under the common umbrella of Frontiers in Surgery, a powerful publication venue is created. Since there is often overlap and common ground between the different surgical specialties, assembling all surgical disciplines into a single journal will foster a collaborative dialogue within the surgical community. This means that publications that are also of interest to other surgical specialties will reach a wider audience and have greater impact.
The aim of this multidisciplinary journal is to create a discussion and knowledge platform of advances and research findings in surgical practice today to continuously improve clinical management of patients and foster innovation in this field.