Title: Prediction of COVID-19 Based on Genomic Biomarkers of Metagenomic Next-Generation Sequencing Data Using Artificial Intelligence Technology
Authors: S. Akbulut
Journal: Erciyes Medical Journal, volume 36 1 (JCR Q3, Medicine, General & Internal; impact factor 0.3)
Publication date: 2022-01-01
Publication type: Journal Article
DOI: 10.14744/etd.2022.00868 (https://doi.org/10.14744/etd.2022.00868)
Citations: 2
Abstract
Objective: The primary aim of this study was to use metagenomic next-generation sequencing (mNGS) data to identify biomarker genes related to coronavirus disease 2019 (COVID-19) and to construct a machine learning model that could differentiate patients with COVID-19 from healthy controls. Materials and Methods: The mNGS dataset used in the study contained upper-airway expression levels of 15,979 genes from 234 patients, some COVID-19 positive and some COVID-19 negative. The Boruta method was used to select biomarker genes associated with COVID-19. Random forest (RF), gradient boosting tree (GBT), and multi-layer perceptron (MLP) models were then trained to predict COVID-19 from the selected biomarker genes. Results: The MLP model (0.936) outperformed the GBT (0.851) and RF (0.809) models in predicting COVID-19. The three most important candidate biomarker genes associated with COVID-19 were IFI27, TPTI, and FAM83A. Conclusion: The proposed MLP model was able to predict COVID-19 successfully. The results suggest that the model could serve as a diagnostic aid in clinical testing and that the selected candidate biomarker genes could serve as potential therapeutic targets or inform vaccine design.
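The feature-selection step named in the abstract can be illustrated with a small sketch. It is not the study's code. It shows only the core idea behind Boruta, which is to compare each real feature against permuted "shadow" copies, and it makes several simplifying assumptions: absolute Pearson correlation stands in for random forest importance, a fixed hit-rate threshold stands in for Boruta's statistical test, and the data are synthetic. The names `abs_corr`, `shadow_select`, `gene_signal`, and `gene_noise_*` are hypothetical.

```python
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(sxy) / (sxx * syy) ** 0.5


def shadow_select(columns, labels, n_rounds=50, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shadow feature in most rounds.

    Assumption: correlation replaces the random forest importance that
    Boruta itself uses, and a fixed hit-rate cutoff replaces its test.
    """
    rng = random.Random(seed)
    real_scores = {name: abs_corr(vals, labels) for name, vals in columns.items()}
    hits = {name: 0 for name in columns}
    for _ in range(n_rounds):
        # A shadow feature is a shuffled copy of a real column, so any
        # association it shows with the labels is due to chance alone.
        best_shadow = 0.0
        for vals in columns.values():
            shadow = list(vals)
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, abs_corr(shadow, labels))
        for name, score in real_scores.items():
            if score > best_shadow:
                hits[name] += 1
    return [name for name in columns if hits[name] / n_rounds >= min_hit_rate]


# Synthetic stand-in for expression data: one informative "gene" and
# five pure-noise "genes" measured on 200 samples with binary labels.
data_rng = random.Random(42)
labels = [i % 2 for i in range(200)]
columns = {"gene_signal": [2.0 * y + data_rng.gauss(0, 1) for y in labels]}
for k in range(1, 6):
    columns[f"gene_noise_{k}"] = [data_rng.gauss(0, 1) for _ in labels]

selected = shadow_select(columns, labels)
print(selected)
```

In a full pipeline the selected columns would then be passed to the classifiers being compared (RF, GBT, and MLP in the abstract), with performance measured on held-out samples.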
About the journal:
Erciyes Medical Journal (Erciyes Med J) is the international, peer-reviewed, open access publication of Erciyes University School of Medicine. The journal has been in continuous publication since 1978 and is published quarterly, in March, June, September, and December. The publication language of the journal is English. The journal accepts clinical and experimental research articles in different fields of medicine, original case reports, letters to the editor, and invited reviews. Research articles and case reports on regionally frequent and specific medical topics are prioritized. The journal also publishes manuscripts on national and international scientific meetings and symposiums, as well as manuscripts that share scientific correspondence and knowledge between authors and their readers.