Prediction of COVID-19 severity using machine learning

IF 7.9 · Zone 1 (Medicine) · Q1 MEDICINE, RESEARCH & EXPERIMENTAL · Clinical and Translational Medicine · Pub Date: 2024-10-06 · DOI: 10.1002/ctm2.70042

Kanita Karaduzovic-Hadziabdic, Muhamed Adilovic, Lu Zhang, Andrew I Lumley, Pranay Shah, Muhammad Shoaib, Venkata Satagopam, Prashant Kumar Srivastava, Costanza Emanueli, Simona Greco, Alisia Madè, Teresa Padro, Pedro Domingo, Mitja Lustrek, Markus Scholz, Maciej Rosolowski, Marko Jordan, Bettina Benczik, Bence Ágg, Péter Ferdinandy, Andrew H Baker, Guy Fagherazzi, Markus Ollert, Joanna Michel, Gabriel Sanchez, Hüseyin Firat, Timo Brandenburger, Fabio Martelli, Lina Badimon, Yvan Devaux, COVIRNA consortium (www.covirna.eu)



Abstract

Dear Editor,

Prediction of COVID-19 severity is a critical task in the decision-making process during the initial stages of the disease, enabling personalised surveillance and care of COVID-19 patients. To develop a machine learning (ML) model for the prediction of COVID-19 severity, a consortium of 15 institutions from 12 European countries analysed expression data of 2906 blood long noncoding RNAs (lncRNAs) and clinical data collected from four independent cohorts, totalling 564 patients with COVID-19. This predictive model based on age and five lncRNAs predicted disease severity with an area under the receiver operating characteristic curve (AUC) of .875 [.868–.881] and an accuracy of .783 [.775–.791].

The sudden onset of the COVID-19 pandemic caught the world unprepared, leading to more than 774 million confirmed cases and over 7 million reported deaths worldwide (from January 2020 to March 2024), according to the World Health Organization (WHO).¹ Besides affecting the respiratory system, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) can also infect nonpulmonary cells such as cardiac and brain cells, leading to cardiovascular or neurological symptoms.² With the recent advances in high-throughput sequencing, a large number of RNA signatures have emerged as promising biomarkers involved in the progression of various diseases, including cardiovascular diseases.³ In response to the COVID-19 pandemic, partners of the EU-CardioRNA COST Action network⁴⁻⁶ joined forces in the H2020-funded COVIRNA project to develop an RNA-based diagnostic test using artificial intelligence (AI) that can help predict clinical outcomes after COVID-19.⁷ We chose to implement a targeted sequencing approach using the FIMICS panel of 2906 cardiac-enriched or heart failure-associated lncRNAs previously characterised by our consortium.⁸ In the present study, we aimed to apply the FIMICS panel to identify lncRNAs that predict disease severity in COVID-19 patients. We used an ML-based approach for the predictive analysis, as ML algorithms are well suited to analysing complex relationships in biomedical data.⁹

The overall workflow of the study is illustrated in Figure 1A. Briefly, four European cohorts were included in the study consisting of a total of 564 patients with COVID-19: the PrediCOVID cohort from Luxembourg (n = 162; recruitment period May 2020 to present), the COVID19_OMICS-COVIRNA cohort from Italy (n = 100; recruitment period March 2020 to January 2021), the TOCOVID cohort from Spain (n = 233; recruitment period April 2020 to June 2021), and the MiRCOVID cohort from Germany (n = 69; recruitment period April 2020 to November 2021). Patient characteristics are presented in Table 1. Plasma samples collected from patients at baseline were stored at −80°C in a central NF S96-900-certified Biobank at Firalis SA. Samples were then processed using the following workflow: RNA extraction, quality check, library preparation, and analysis by targeted sequencing using the FIMICS panel. Overall, 463 datasets representing each unique patient from four independent cohorts were available for the present analysis (Figure 1B).

The 463 datasets were then used in an ML workflow to identify the most important predictors (lncRNAs and clinical variables) and to build a model predicting disease severity of COVID-19 patients in balanced (Figure 2A) and imbalanced (Figure 2B) datasets. Briefly, the available data were split into training and validation sets (80/20 split), and feature selection was then performed on the training set; to be selected, a feature had to appear in 90 out of the 100 iterations. The selected features were included in a model, which was then evaluated on the validation set, and the final model with the highest predictive capacity (highest AUC) was chosen. Using this method, we identified six features as the best predictors of COVID-19 severity, each selected in more than 90 out of 100 iterations (Figure 3A). Cross-validation of the selected features was also performed using two biostatistical methods (GLMnet and Stability selection; Figure 3B). The six features identified were age and five lncRNAs: SEQ0548 (LINC01088-201), SEQ0817 (FGD5-AS1), SEQ1056 (LINC01088-209), SEQ3051 (an unannotated lncRNA, henceforth referred to as lncCOVIRNA1) and SEQ1321 (AKAP13-SI). Box/violin plots of the selected predictors (Figure 4A–F) show significant (p < .001) differences between patients in the critical and stable groups.
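The selection rule described above (repeat feature selection over 100 iterations and keep features chosen at least 90 times) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic data, the function names, and in particular the stand-in selector that ranks features by the standardised gap between class means. The letter does not state which selector was used, so this is not the authors' code, which is in their GitHub repository.

```python
import random
import statistics
from collections import Counter

def make_synthetic(n_patients=200, n_features=20, informative=(0, 1, 2), seed=0):
    """Hypothetical data: feature rows plus a 0/1 severity label."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_patients):
        y = rng.randint(0, 1)
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            x[j] += 2.0 * y  # shift informative features in the severe class
        rows.append(x)
        labels.append(y)
    return rows, labels

def top_k_by_mean_gap(rows, labels, k):
    """Stand-in selector (assumption): rank features by standardised class-mean gap."""
    scores = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        spread = statistics.pstdev(pos + neg) or 1.0
        gap = abs(statistics.mean(pos) - statistics.mean(neg)) / spread
        scores.append((gap, j))
    return {j for _, j in sorted(scores, reverse=True)[:k]}

def stable_features(rows, labels, iterations=100, threshold=90, k=5, seed=1):
    """Count how often each feature is selected on random 80% training subsets."""
    rng = random.Random(seed)
    counts = Counter()
    n_train = int(0.8 * len(rows))
    for _ in range(iterations):
        idx = list(range(len(rows)))
        rng.shuffle(idx)
        train = idx[:n_train]
        counts.update(
            top_k_by_mean_gap([rows[i] for i in train], [labels[i] for i in train], k)
        )
    # Keep only features selected in at least `threshold` iterations.
    return sorted(j for j, c in counts.items() if c >= threshold)

rows, labels = make_synthetic()
selected = stable_features(rows, labels)
print(selected)
```

In this toy setup the three shifted features pass the threshold in every run; a noise feature can occasionally pass as well, because the 80% subsamples overlap heavily, which is one reason a second check such as the GLMnet and stability selection comparison mentioned above is useful.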

Table 2 presents results on the balanced dataset using the six selected features (age, LINC01088-201, FGD5-AS1, LINC01088-209, lncCOVIRNA1 and AKAP13-SI) across multiple ML models (Naïve Bayes, Logistic Regression, Extreme gradient boosting, Support Vector Machine, Multilayer Perceptron, K-Nearest Neighbours). We also built and evaluated ML models using only age as a predictor (Table S1) and using only the five selected lncRNAs (Table S2). Overall, the best results were obtained using all six selected features (age and the five lncRNAs) in the Naïve Bayes model, which achieved an AUC of .875 (95% CI .868–.881) and an accuracy of .783 (95% CI .775–.791; Table 2 and Figure S1).
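To make the two reported metrics concrete, the following sketch fits a Naïve Bayes classifier on six synthetic features and computes AUC and accuracy on a held-out 20% split. It assumes a Gaussian variant of Naïve Bayes (the letter does not specify the variant), uses made-up data, and all function names are hypothetical. It does not reproduce the published figures or their confidence intervals.

```python
import math
import random

def fit_gaussian_nb(rows, labels):
    """Per class: prior, per-feature mean and per-feature variance."""
    model = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        means = [sum(col) / len(col) for col in zip(*members)]
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + 1e-9  # floor avoids zero variance
            for col, m in zip(zip(*members), means)
        ]
        model[c] = (len(members) / len(rows), means, variances)
    return model

def predict_proba(model, row):
    """Posterior probability of class 1 under the naive independence assumption."""
    log_post = {}
    for c, (prior, means, variances) in model.items():
        ll = math.log(prior)
        for v, m, s2 in zip(row, means, variances):
            ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_post[c] = ll
    top = max(log_post.values())  # subtract the max for numerical stability
    weights = {c: math.exp(lp - top) for c, lp in log_post.items()}
    return weights[1] / sum(weights.values())

def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic balanced data: six features, each shifted by one SD in class 1.
rng = random.Random(0)
labels = [i % 2 for i in range(300)]
rows = [[rng.gauss(1.0 * y, 1.0) for _ in range(6)] for y in labels]

# 80/20 split into training and validation sets.
idx = list(range(300))
rng.shuffle(idx)
train, valid = idx[:240], idx[240:]

model = fit_gaussian_nb([rows[i] for i in train], [labels[i] for i in train])
probs = [predict_proba(model, rows[i]) for i in valid]
truth = [labels[i] for i in valid]

auc = roc_auc(probs, truth)
accuracy = sum((p >= 0.5) == (y == 1) for p, y in zip(probs, truth)) / len(truth)
print(f"AUC={auc:.3f} accuracy={accuracy:.3f}")
```

AUC is threshold-free, while accuracy here depends on the 0.5 cut-off, so the two numbers can diverge, as they do in the reported results (.875 versus .783).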

The developed ML model can be used as an integral part of the development of a molecular diagnostic assay utilising routinely available quantitative PCR methods to quantify blood levels of the five lncRNAs, which then serve as input to the ML model for COVID-19 severity prediction. Together with another whole blood-based ML algorithm,¹⁰ the use of the present ML model based on plasma samples could have significant clinical implications, for instance by selecting high-risk patients for tailored treatment. An advantage of the present method is that it allows faster risk stratification of patients for decision making, which is especially useful during a pandemic, and is based on a widely used plasma sample. LncRNAs can be easily and quickly (2 h) measured in a noninvasive plasma sample. The increasing interest of the biomedical community in RNA molecules to treat or vaccinate patients could be followed by approval of circulating RNAs as disease biomarkers for personalised medicine, coupled with artificial intelligence methods.⁷ Moreover, identification of novel disease biomarkers could enhance our knowledge of the mechanisms leading to adverse outcomes or death, which could pave the way to the development of new therapies or repurposing of existing ones.

Taken together, these findings could have significant clinical value to predict disease severity and help to improve the management and outcomes of COVID-19 patients.

All authors are members of the COVIRNA consortium who conducted the present study. Kanita Karaduzovic-Hadziabdic, Muhamed Adilovic, Fabio Martelli, Yvan Devaux and Lu Zhang designed the research study. Yvan Devaux acquired funding. Kanita Karaduzovic-Hadziabdic, Muhamed Adilovic and Lu Zhang conducted the experiments. Pranay Shah conducted GLMNet and SS analysis. Muhammad Shoaib curated the data. Lu Zhang and Muhamed Adilovic preprocessed the data. Kanita Karaduzovic-Hadziabdic, Muhamed Adilovic, Yvan Devaux, Lu Zhang, Andrew I Lumley, Pranay Shah, Muhammad Shoaib, Prashant Kumar Srivastava, Mitja Lustrek, Maciej Rosolowski, Marko Jordan and Bettina Benczik analysed data. Firalis staff (Joanna Michel, Gabriel Sanchez, Hüseyin Firat) were responsible for sample storage and RNA extraction, library preparation, targeted RNA sequencing and raw data analysis. Kanita Karaduzovic-Hadziabdic wrote the draft and the final manuscript. Yvan Devaux supervised the writing of the manuscript and critically revised it for important intellectual content. Muhamed Adilovic, Yvan Devaux, Andrew I Lumley, Pranay Shah, Hüseyin Firat and Joanna Michel revised the manuscript, provided comments and wrote parts of the final manuscript. Muhamed Adilovic and Lu Zhang prepared the figures and tables. Yvan Devaux, Fabio Martelli, Alisia Madè, Simona Greco, Lina Badimon, Teresa Padro, Pedro Domingo, Timo Brandenburger, Guy Fagherazzi and Markus Ollert participated in acquiring patient samples and data. All the authors revised the draft manuscript and approved the final version of the manuscript.

MA is co-first author, who together with KK-H conducted the majority of the experiments and other contributions as noted above. The order among co-first authors was assigned based on contributions.

YD holds patents and licensing agreements related to the use of RNAs for diagnostic and therapeutic purposes (WO2018229046, licensed to Firalis SA, protecting the use of lncRNAs in the FIMICS panel used for RNAseq in the present paper; other patents and licenses are not related to the present work). YD is a Scientific Advisory Board member of Firalis SA.

PF is the founder and CEO of Pharmahungary Group, a group of R&D companies.

LB declares to have acted as a SAB member of Sanofi, Ionnis, MSD and NovoNordisk; to have received speaker fees from Sanofi, Bayer and AB-Biotics SA and to have founded the spin-off Ivastatin Therapeutics S.L. (all unrelated to this work).

TP declares to have received speaker fees from AB-Biotics SA and to be a co-founder of the Spin-off Ivastatin Therapeutics SL (all unrelated to this work).

MS received funding from Pfizer Inc. and from Owkin for projects not related to this research.

HF is the founder and owner of Firalis SA, a company commercialising the FIMICS panel. He holds patents and licenses for the use of RNAs as biomarkers and therapeutic targets.

All other authors declare no competing interests.

This work was supported by the EU Horizon 2020 project COVIRNA awarded to YD (grant agreement # 101016072).

The Predi-COVID study was supported by the Luxembourg National Research Fund (FNR) (Predi-COVID, grant number 14716273), the André Losch Foundation and the European Regional Development Fund (FEDER, convention 2018-04-026-21).

YD is funded by the EU Horizon 2020 project COVIRNA (grant agreement # 101016072), the National Research Fund (grants # C14/BM/8225223, C17/BM/11613033 and COVID-19/2020-1/14719577/miRCOVID), the Ministry of Higher Education and Research, and the Heart Foundation-Daniel Wagner of Luxembourg.

FM is supported by the Italian Ministry of Health (Ricerca Corrente 2024 1.07.128, RF-2019-12368521 and POS T4 CAL.HUB.RIA cod. T4-AN-09), EU COVIRNA agreement #101016072, Next Generation EU PNRR M6C2 Inv. 2.1 PNRR-MAD-2022-12375790 and PNRR/2022/C9/MCID/I8 FibroThera.

Horizon 2020 Framework Programme 101016072, André Losch Fondation, Heart Foundation-Daniel Wagner of Luxembourg, Ministero della Salute POS T4 CAL.HUB.RIA cod. T4-AN-09, RF-2019-12368521, Ricerca Corrente 2024 1.07.128, Fonds National de la Recherche Luxembourg C14/BM/8225223, C17/BM/11613033, COVID-19/2020-1/14719577/miRCOVID, Next Generation EU, European Regional Development Fund FEDER, convention 2018-04-026-21, Ministère de l'Education Nationale, de l'Enseignement Superieur et de la Recherche. P.F. and B.Á. were funded by project no. RRF-2.3.1-21-2022-00003 that has been implemented with the support provided by the European Union. The 2020-1.1.5-GYORSÍTÓSÁV-2021-00011 project was funded by the Ministry for Innovation and Technology with support from the National Research Development and Innovation Fund under the 2020-1.1.5-GYORSÍTÓSÁV call programme. This study was funded by the grant 2020-1.1.6-JÖVŐ-2021-00013 (“Befektetés a jÖvŐbe” NKFIH). This project has received funding from the HUN-REN Hungarian Research Network.

The ML code is available as a Supplementary File and is accessible on the GitHub repository at the following link https://github.com/madilovic/COVIRNA_plasma using ID: 118e2ccd07df8b10b7fc52df95ae11b52bb8216a.

This study was performed in full compliance with the Declaration of Helsinki. The study involved four cohorts comprising COVID-19-positive patients aged 18 years and older from Luxembourg (PrediCOVID study), Italy (COVID19_OMICS-COVIRNA study), Spain (TOCOVID study), and Germany (MiRCOVID study). The Luxembourg PrediCOVID study was approved by the National Research Ethics Committee of Luxembourg (study Number 202003/07) and was registered under ClinicalTrials.gov (NCT04380987). The COVID19_OMICS-COVIRNA study was approved by the Institutional Ethics Committee of the San Raffaele Hospital (protocol number 75/INT/2020, 20/04/2020 and subsequent modification dated 16/12/2020) and was registered with ID NCT04441502. The TOCOVID study was approved by the Research Ethics Committee of the Hospital Santa Creu i Sant Pau, Barcelona (Ref number 21/036) and was registered under ClinicalTrials.gov (NCT04332094). The MiRCOVID study was approved by the Research Ethics Committee of Duesseldorf University (internal study number 2020−912) and was registered under ClinicalTrials.gov (NCT04381351). Periods of patient enrolment and biological sample collection were May 2020 to present for PrediCOVID, March 2020 to January 2021 for COVID19_OMICS-COVIRNA, April 2020 to June 2021 for TOCOVID, and April 2020 to November 2021 for MiRCOVID. Informed consent was signed by all patients enrolled in these studies. Legal agreements for material and data sharing have been signed between each cohort and COVIRNA project coordinator Luxembourg Institute of Health (LIH).


Source journal: Clinical and Translational Medicine

CiteScore: 15.90
Self-citation rate: 1.90%
Articles published: 450
Review time: 4 weeks

Journal description: Clinical and Translational Medicine (CTM) is an international, peer-reviewed, open-access journal dedicated to accelerating the translation of preclinical research into clinical applications and fostering communication between basic and clinical scientists. It highlights the clinical potential and application of various fields including biotechnologies, biomaterials, bioengineering, biomarkers, molecular medicine, omics science, bioinformatics, immunology, molecular imaging, drug discovery, regulation, and health policy. With a focus on the bench-to-bedside approach, CTM prioritizes studies and clinical observations that generate hypotheses relevant to patients and diseases, guiding investigations in cellular and molecular medicine. The journal encourages submissions from clinicians, researchers, policymakers, and industry professionals.