Prognostic prediction of gastric cancer based on H&E findings and machine learning pathomics
Guoda Han, Xu Liu, Tian Gao, Lei Zhang, Xiaoling Zhang, Xiaonan Wei, Yecheng Lin, Bohong Yin
Molecular and Cellular Probes, volume 78, Article 101983. Published 2024-09-30. DOI: 10.1016/j.mcp.2024.101983
https://www.sciencedirect.com/science/article/pii/S0890850824000355
Abstract
Aim
In this study, we aimed to develop a model for accurate prognostic prediction in gastric cancer based on H&E findings combined with machine learning pathomics.
Methods
Transcriptome data, pathological images, and clinical data from 443 cases were retrieved from The Cancer Genome Atlas (TCGA) program for survival analysis. The images were segmented using the Otsu algorithm, and features were extracted using the PyRadiomics package. Subsequently, the cases were randomly divided into a training cohort of 165 cases and a validation cohort of 69 cases. Features selected via minimum redundancy-maximum relevance (mRMR) and recursive feature elimination (RFE) screening were used to train a model with the gradient boosting machine (GBM) algorithm. The model's performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), calibration curves, and decision curves. Additionally, the correlation between the pathomics score (PS) and immune genes was examined.
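The Otsu segmentation step mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function name `otsu_threshold`, the 8-bit grayscale assumption, and the toy pixel values are illustrative assumptions. The idea is to pick the gray level that maximizes the between-class variance of the two resulting pixel groups.

```python
from typing import Sequence


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form the lower class, the rest the upper class.
    Assumes integer intensities in the range [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # summed intensity of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break  # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy data (hypothetical): dark "tissue" pixels and bright "background" pixels.
toy_pixels = [10] * 50 + [200] * 50
t = otsu_threshold(toy_pixels)
tissue_mask = [p <= t for p in toy_pixels]
```

On a real slide the threshold would be computed on a grayscale version of each image tile, and the resulting mask would define the region passed to feature extraction.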
Results
In the multivariate analysis, higher infiltration of activated CD4 memory T cells was significantly associated with improved overall survival (HR = 0.505, 95 % CI = 0.342–0.745, P < 0.001). The pathomic model achieved AUC values of 0.844 and 0.750 in the two study cohorts. Decision curve analysis (DCA) supported the model's clinical utility. In a subsequent multivariate analysis, a higher PS also emerged as a significant protective factor for overall survival (HR = 0.506, 95 % CI = 0.329–0.777, P = 0.002).
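For readers unfamiliar with the two evaluation measures reported here, the sketch below shows how an AUC and a decision-curve net benefit can be computed. It is an illustration on made-up labels and scores, not the study's data or code; the function names `roc_auc` and `net_benefit` are hypothetical. The AUC is computed as the fraction of positive-negative pairs in which the positive case receives the higher score, and net benefit uses the standard definition TP/n - (FP/n) * pt / (1 - pt) at threshold probability pt.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels: Sequence[int], scores: Sequence[float], pt: float) -> float:
    """Net benefit of treating cases with score >= pt (0 < pt < 1)."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= pt and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= pt and y == 0)
    return tp / n - (fp / n) * (pt / (1.0 - pt))


# Made-up example: 1 = event, 0 = no event; scores are predicted probabilities.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(toy_labels, toy_scores)          # 8 of 9 pairs ranked correctly
nb = net_benefit(toy_labels, toy_scores, 0.5)  # 3 true positives, 1 false positive
```

A decision curve plots `net_benefit` over a range of `pt` values and compares it with the "treat all" and "treat none" strategies.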
Conclusion
The pathomic model based on H&E slides for predicting the infiltration degree of activated CD4 memory T cells, along with integrated bioinformatics analysis elucidating potential molecular mechanisms, offers novel prognostic indicators for the precise stratification and individualized prognosis of gastric cancer patients.
About the journal:
MCP (Advancing biology through -omics and bioinformatic technologies) aims to capture outcomes from the current revolution in molecular technologies and sciences. The journal has broadened its scope and embraces high-quality research papers, reviews, and opinions in areas including, but not limited to, molecular biology, cell biology, biochemistry, immunology, physiology, epidemiology, ecology, virology, microbiology, parasitology, genetics, evolutionary biology, genomics (including metagenomics), bioinformatics, proteomics, metabolomics, glycomics, and lipidomics. Submissions with a technology-driven focus on understanding normal biological or disease processes, as well as conceptual advances and paradigm shifts, are particularly encouraged. The Editors welcome both fundamental and applied research, as well as pre-submission enquiries about advanced draft manuscripts. Top-quality research and manuscripts will be fast-tracked.