Application of machine-learning model to optimize colonic adenoma detection in India.
Nitin Jagtap, Rakesh Kalapala, Hardik Rughwani, Aniruddha Pratap Singh, Pradev Inavolu, Mohan Ramchandani, Sundeep Lakhtakia, P Manohar Reddy, Anuradha Sekaran, Manu Tandan, Zaheer Nabi, Jahangeer Basha, Rajesh Gupta, Sana Fathima Memon, G Venkat Rao, Prateek Sharma, D Nageshwar Reddy
Indian Journal of Gastroenterology, pp. 995-1001. Published 2024-10-01 (Epub 2024-05-17). DOI: https://doi.org/10.1007/s12664-024-01530-4
Trial registration: ClinicalTrials.gov (ID: NCT04512729).
Abstract
Aims: There are limited data on the prevalence and risk factors of colonic adenoma from the Indian subcontinent. We aimed to develop a machine-learning model to optimize colonic adenoma detection in a prospective cohort.
Methods: All consecutive adult patients undergoing diagnostic colonoscopy between October 2020 and November 2022 were enrolled. Patients with a high risk of colonic adenoma were excluded. The predictive model was developed using the gradient-boosting machine (GBM) learning method. The GBM model was further optimized by adjusting the learning rate and the number of trees, with 10-fold cross-validation.
Results: A total of 10,320 patients (mean age 45.18 ± 14.82 years; 69% men) were included in the study. In the overall population, 1152 (11.2%) patients had at least one adenoma. In patients aged > 50 years, hospital-based adenoma prevalence was 19.5% (808/4144). The area under the receiver operating characteristic curve (AUC) (SD) of the logistic regression model was 72.55% (4.91%), while the AUCs for the deep learning, decision tree, random forest and gradient-boosted tree models were 76.25% (4.22%), 65.95% (4.01%), 79.38% (4.91%) and 84.76% (2.86%), respectively. After model optimization and cross-validation, the AUC of the gradient-boosted tree model increased to 92.2% (1.1%).
Conclusions: Machine-learning models may predict colorectal adenoma more accurately than logistic regression. A machine-learning model may help optimize the use of colonoscopy to prevent colorectal cancers.
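As an illustration of the tuning step described in the Methods, the following is a minimal sketch, not the authors' pipeline. It assumes scikit-learn, a synthetic dataset standing in for the cohort (about 11% positives, loosely mirroring the reported prevalence), and an arbitrary small grid for the learning rate and number of trees; none of these choices come from the paper. It reports AUC as mean (SD) across 10 folds, the same form used in the Results.

```python
# Illustrative sketch only: synthetic data, NOT the study cohort.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic stand-in for patient features (assumption); ~11% positive class.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.89, 0.11], random_state=0,
)

# 10-fold cross-validation, stratified so each fold keeps the class ratio.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Baseline: logistic regression, scored by AUC on each fold.
baseline_auc = cross_val_score(
    LogisticRegression(max_iter=1000), X, y, cv=cv, scoring="roc_auc"
)

# Tune the two hyperparameters named in the Methods.
# The grid values below are assumptions chosen for a quick demo.
param_grid = {"learning_rate": [0.05, 0.1], "n_estimators": [50, 100]}
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid, cv=cv, scoring="roc_auc",
)
search.fit(X, y)

# Mean and SD of the fold-wise AUC for the best parameter combination.
best = search.best_index_
gbm_mean = search.cv_results_["mean_test_score"][best]
gbm_sd = search.cv_results_["std_test_score"][best]

print(f"Logistic regression AUC: {baseline_auc.mean():.3f} ({baseline_auc.std():.3f})")
print(f"Tuned GBM AUC: {gbm_mean:.3f} ({gbm_sd:.3f})")
print("Best parameters:", search.best_params_)
```

Selecting hyperparameters and reporting the AUC on the same folds, as this sketch does, tends to give an optimistic estimate; nested cross-validation or a held-out test set is a common safeguard. The abstract does not state which approach the authors used.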
About the journal:
The Indian Journal of Gastroenterology aims to help doctors everywhere practise better medicine and to influence the debate on gastroenterology. To achieve these aims, we publish original scientific studies, state-of-the-art special articles, reports and papers commenting on the clinical, scientific and public health factors affecting aspects of gastroenterology. We shall be delighted to receive articles for publication in all of these categories and letters commenting on the contents of the Journal or on issues of interest to our readers.