Integrating transcriptomics and proteomics to analyze the immune microenvironment of cytomegalovirus associated ulcerative colitis and identify relevant biomarkers.
Yang Chen, Qingqing Zheng, Hui Wang, Peiren Tang, Li Deng, Pu Li, Huan Li, Jianhong Hou, Jie Li, Li Wang, Jun Peng
BioData Mining, vol. 17, no. 1, article 26. Published 2024-08-27. DOI: 10.1186/s13040-024-00382-0 (https://doi.org/10.1186/s13040-024-00382-0)
Citations: 0
Abstract
Background: In recent years, significant morbidity and mortality in patients with severe inflammatory bowel disease (IBD) and cytomegalovirus (CMV) infection have drawn considerable attention to CMV infection of the intestinal mucosa in IBD patients and its role in disease progression. However, no high-throughput sequencing data are currently available for ulcerative colitis patients with CMV infection (CMV + UC), and the immune microenvironment in these patients has yet to be explored.
Method: The xCell algorithm was used to evaluate the immune microenvironment of CMV + UC patients. Weighted gene co-expression network analysis (WGCNA) was then applied to identify gene-level and protein-level co-expression modules associated with the abnormal immune cells. Next, three machine learning approaches (Random Forest, SVM-RFE, and Lasso) were used to filter candidate biomarkers. Finally, a best subset selection algorithm was used to construct the diagnostic model.
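The three-way filtering step can be sketched in Python with scikit-learn on synthetic data. This is an illustration under stated assumptions, not the study's pipeline: the abstract does not say how the outputs of the three selectors were combined, so the intersection rule, the cutoff k, the Lasso penalty alpha, and the helper name consensus_features are all hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def consensus_features(X, y, names, k=8, alpha=0.02, seed=0):
    """Return the features chosen by all three selectors.

    Taking the intersection is an assumption made for this sketch.
    """
    Xs = StandardScaler().fit_transform(X)

    # Random Forest: keep the k features with the largest importance.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(Xs, y)
    rf_idx = set(np.argsort(rf.feature_importances_)[-k:].tolist())

    # SVM-RFE: repeatedly drop the weakest feature of a linear SVM until k remain.
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=k).fit(Xs, y)
    svm_idx = set(np.flatnonzero(rfe.support_).tolist())

    # Lasso: keep features with a nonzero coefficient. The 0/1 label is
    # treated as a numeric response here, which is a simplification.
    lasso = Lasso(alpha=alpha).fit(Xs, y)
    lasso_idx = set(np.flatnonzero(lasso.coef_).tolist())

    return [names[i] for i in sorted(rf_idx & svm_idx & lasso_idx)]


# Synthetic stand-in for an expression matrix (samples x features).
X, y = make_classification(
    n_samples=120, n_features=20, n_informative=4, n_redundant=0,
    shuffle=False, class_sep=2.0, random_state=0,
)
names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholder names
selected = consensus_features(X, y, names)
print(selected)
```

An intersection is the strictest way to combine the selectors; a majority vote (features chosen by at least two of the three) would be a looser alternative.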
Results: We performed transcriptomic and proteomic sequencing on CMV + UC patients to establish a comprehensive immune microenvironment profile and found 11 immune cell types that were specifically abnormal in the CMV + UC group. Using multi-omics integration algorithms, we identified seven co-expression gene modules and five co-expression protein modules. We then applied several machine learning algorithms to identify key biomarkers with diagnostic efficacy and constructed an early diagnostic model. In total, we identified eight biomarkers (PPP1R12B, CIRBP, CSNK2A2, DNAJB11, PIK3R4, RRBP1, STX5, TMEM214) that play crucial roles in the immune microenvironment of CMV + UC and show superior diagnostic performance for CMV + UC.
Conclusion: This eight-biomarker model offers a new paradigm for the diagnosis and treatment of IBD patients after CMV infection. Further research into this model will be important for understanding changes in the host immune microenvironment following CMV infection.
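Best subset selection, the technique named for building the diagnostic model, can be illustrated with a small exhaustive search in Python. This is a minimal sketch on synthetic data: the abstract does not state the model family or the selection criterion, so the linear probability model, the BIC score, the max_size limit, and the names best_subset and gene_0 to gene_5 are assumptions made for illustration only.

```python
from itertools import combinations

import numpy as np


def best_subset(X, y, names, max_size=3):
    """Score every feature subset up to max_size by BIC (lower is better)."""
    n = len(y)
    best_bic, best_cols = np.inf, ()
    for size in range(1, max_size + 1):
        for cols in combinations(range(X.shape[1]), size):
            # Ordinary least squares with an intercept on the chosen columns.
            A = np.column_stack([np.ones(n), X[:, list(cols)]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ beta) ** 2))
            # Gaussian BIC up to a constant: fit term plus complexity penalty.
            bic = n * np.log(rss / n) + (size + 1) * np.log(n)
            if bic < best_bic:
                best_bic, best_cols = bic, cols
    return [names[c] for c in best_cols], float(best_bic)


# Synthetic candidates: only columns 1 and 4 drive the binary label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
signal = 2.0 * X[:, 1] - 1.5 * X[:, 4] + rng.normal(scale=0.5, size=200)
y = (signal > 0).astype(float)
names = [f"gene_{i}" for i in range(X.shape[1])]  # placeholder names

chosen, score = best_subset(X, y, names)
print(chosen, round(score, 2))
```

The search fits every combination, so its cost grows combinatorially with the number of candidates. That is why it is usually run only after an earlier filtering step has reduced the candidate list.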
About the journal:
BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data.
Topical areas include, but are not limited to:
- Development, evaluation, and application of novel data mining and machine learning algorithms.
- Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
- Open-source software for the application of data mining and machine learning algorithms.
- Design, development, and integration of databases, software, and web services for the storage, management, retrieval, and analysis of data from large-scale studies.
- Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.