Yue Wu, Kai-Yuan Min, Jiang-Feng Liu, Wan-Feng Liang, Ye-Hong Yang, Gang Hu, Jun-Tao Yang
Journal: Acta Academiae Medicinae Sinicae (中国医学科学院学报). Publication date: 2024-04-01. Publication type: Journal Article. DOI: 10.3881/j.issn.1000-503X.15717 (https://doi.org/10.3881/j.issn.1000-503X.15717)
[Identification of Protein-Coding Gene Markers in Breast Invasive Carcinoma Based on Machine Learning].
Objective: To screen for biomarkers linked to the prognosis of breast invasive carcinoma by analyzing transcriptome data with random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical boosting (CatBoost).
Methods: We obtained expression data for breast invasive carcinoma from The Cancer Genome Atlas and used DESeq2, the t-test, and univariate Cox analysis to identify differentially expressed protein-coding genes associated with survival prognosis in human breast invasive carcinoma samples. RF, XGBoost, LightGBM, and CatBoost models were then built to mine protein-coding gene markers related to the prognosis of breast invasive carcinoma, and the performance of the models was compared. Breast cancer expression data from the Gene Expression Omnibus were used for validation.
Results: A total of 151 differentially expressed protein-coding genes related to survival prognosis were identified. The machine learning model built with C3orf80, UGP2, and SPC25 showed the best performance.
Conclusions: Three protein-coding genes (UGP2, C3orf80, and SPC25) were identified as markers of breast invasive carcinoma. This study provides a new direction for the diagnosis and treatment of breast invasive carcinoma.
About the journal:
Acta Academiae Medicinae Sinicae was founded in February 1979. It is a comprehensive academic medical journal distributed in China and abroad, supervised by the Ministry of Health of the People's Republic of China and sponsored by the Chinese Academy of Medical Sciences and Peking Union Medical College.
The journal mainly reports the latest research results, progress, and developments in basic medicine, clinical medicine, pharmacy, preventive medicine, biomedicine, and medical teaching and research, with the aim of promoting the exchange of medical information and raising the academic level of medicine. The journal is currently indexed in 10 international retrieval systems and databases [Medline (PubMed online version), Elsevier, EMBASE, CA, WPRIM, ExtraMED, IC, JST, UPD, and EBSCO-ASP] and in major domestic retrieval systems and databases [China Science Citation Database (Documentation and Information Center of the Chinese Academy of Sciences), China Core Journals Overview (Peking University Library), China Science and Technology Paper Statistical Source Database (China Science and Technology Core Journals) (China Institute of Scientific and Technological Information), and China Science and Technology Journal Paper and Citation Database (China Institute of Scientific and Technological Information)].