Machine learning prediction of overall survival in prostate adenocarcinoma using ensemble techniques
Authors: Declan Ikechukwu Emegano, Mubarak Taiwo Mustapha, Dilber Uzun Ozsahin, Ilker Ozsahin
Journal: Computers in Biology and Medicine, volume 189, Article 110008 (published 2025-05-01; Epub 2025-03-12)
Impact factor: 6.3; JCR: Q1 (Biology)
DOI: 10.1016/j.compbiomed.2025.110008
URL: https://www.sciencedirect.com/science/article/pii/S0010482525003592
Prostate adenocarcinoma (PAC) is a complex and common cancer in males and one of the leading causes of cancer-related death globally. PAC is a multifaceted disease that encompasses different subtypes, including acinar and ductal adenocarcinoma, small cell carcinoma, neuroendocrine tumors, and transitional cell carcinoma, each presenting distinct prognostic difficulties. Predicting overall survival (OS) in individuals with PAC therefore remains a substantial clinical challenge, owing to the heterogeneity of the disease, coexisting medical conditions, and the limitations of conventional diagnostic markers. For this reason, we focus on using ensemble machine learning (ML) models to predict the OS of PAC patients.
We evaluated eight ensemble ML models: Random Forest (RF), AdaBoost, Gradient Boosting (GB), Extreme Gradient Boosting (XGB), LightGBM (LGBM), CatBoost, Hard Voting Classifier (HVC), and Support Vector Classifier (SVC), using a data set obtained from The Cancer Genome Atlas (TCGA) PanCancer Atlas. The models were assessed on accuracy, precision, recall, F1 score, and ROC AUC. GB outperformed the other models, obtaining a perfect score of 1.0 in accuracy, precision, recall, and F1 score, and a ROC AUC of 0.99. RF and AdaBoost also performed strongly, suggesting their potential in healthcare settings for predicting PAC survival. In conclusion, the study highlights the importance of ensemble techniques in improving prediction precision and underscores the need for further research in clinical settings.
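The performance indicators named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the labels, the three model outputs, and every function name below are hypothetical and chosen only to show how accuracy, precision, recall, F1 score, and ROC AUC are computed for a binary outcome, and how a hard voting classifier combines individual predictions by majority vote.

```python
# Toy illustration only: labels and model outputs are made up, not TCGA data.

def hard_vote(votes_per_model):
    """Majority vote per sample across binary (0/1) model predictions."""
    n_models = len(votes_per_model)
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*votes_per_model)]

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1, treating label 1 as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def roc_auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical ground truth and predictions from three hypothetical models.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_preds = [
    [1, 0, 1, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
]

ensemble_pred = hard_vote(model_preds)
metrics = classification_metrics(y_true, ensemble_pred)

# Use the fraction of models voting positive as a simple ranking score.
vote_share = [sum(v) / len(v) for v in zip(*model_preds)]
auc = roc_auc(y_true, vote_share)

print(ensemble_pred, metrics, auc)
```

In this toy case the majority vote makes one error while each individual model makes two, which is the kind of gain ensembling is meant to provide. With a data set as small as this one, such scores say nothing about how a model would generalize.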
About the journal:
Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.