Robust Diagnosis of Acute Bacterial and Viral Infections via Host Gene Expression Rank-Based Ensemble Machine Learning Algorithm: A Multi-Cohort Model Development and Validation Study
Authors: Yifei Shen, Dongsheng Han, Wenxin Qu, Fei Yu, Dan Zhang, Yifan Xu, Enhui Shen, Qinjie Chu, Michael P Timko, Longjiang Fan, Shufa Zheng, Yu Chen
Journal: Clinical Chemistry. Published: 2025-01-21. DOI: 10.1093/clinchem/hvae220 (https://doi.org/10.1093/clinchem/hvae220)
BACKGROUND
The accurate and prompt diagnosis of infections is essential for improving patient outcomes and preventing bacterial drug resistance. Host gene expression profiling holds great potential for supporting early and accurate infection diagnosis.
METHODS
To improve the precision of infection diagnosis, we developed InfectDiagno, a rank-based ensemble machine learning algorithm that diagnoses infection from host gene expression patterns. Eleven data sets were used as training data sets for method development, and the InfectDiagno algorithm was optimized using multi-cohort training samples. Nine data sets were used as independent validation data sets. We further validated the diagnostic capacity of InfectDiagno in a prospective clinical cohort.
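The abstract does not state how expression ranks are computed or which base learners make up the ensemble, so the Python sketch below is only an illustration under stated assumptions. It shows one common way to turn a sample's expression values into within-sample ranks, which are unchanged by any monotonic rescaling between cohorts or platforms, and a simple score-averaging ensemble. The function names, gene names, and the averaging rule are hypothetical and are not taken from InfectDiagno.

```python
from typing import Callable, Dict, Sequence

RankProfile = Dict[str, float]


def expression_ranks(sample: Dict[str, float]) -> RankProfile:
    """Convert one sample's expression values to within-sample ranks in [0, 1].

    Tied values receive the average of the ranks they span.
    """
    genes = sorted(sample, key=lambda g: sample[g])
    n = len(genes)
    ranks: RankProfile = {}
    i = 0
    while i < n:
        j = i
        # Extend j over the run of genes tied with genes[i].
        while j + 1 < n and sample[genes[j + 1]] == sample[genes[i]]:
            j += 1
        avg_rank = (i + j) / 2.0  # zero-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[genes[k]] = avg_rank / (n - 1) if n > 1 else 0.0
        i = j + 1
    return ranks


def ensemble_score(
    profile: RankProfile,
    base_models: Sequence[Callable[[RankProfile], float]],
) -> float:
    """Average the scores of several base models (assumed soft-voting rule)."""
    if not base_models:
        raise ValueError("at least one base model is required")
    return sum(model(profile) for model in base_models) / len(base_models)


# Hypothetical gene names and values, for illustration only.
log_scale = {"GENE_A": 5.0, "GENE_B": 9.0, "GENE_C": 7.0}
# The same sample on a different, monotonically related scale.
linear_scale = {gene: 2.0 ** value for gene, value in log_scale.items()}

# Both scales yield the same rank profile.
print(expression_ranks(log_scale) == expression_ranks(linear_scale))
```

Working on ranks rather than raw values is one plausible reason a model can be trained on pooled cohorts without per-platform normalization, although the abstract itself does not describe the mechanism.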
RESULTS
After selecting 100 feature genes based on their gene expression ranks for infection prediction, we trained two classifiers: a noninfected-vs-infected classifier with an area under the receiver-operating characteristic curve (AUC) of 0.95 (95% CI, 0.93-0.97) and a bacterial-vs-viral classifier with an AUC of 0.95 (95% CI, 0.93-0.97). We then combined the noninfected/infected classifier with the bacterial/viral classifier to build a discriminating infection diagnosis model. The sensitivity was 0.931 and 0.872, and the specificity 0.963 and 0.929, for bacterial and viral infections, respectively. We then applied InfectDiagno to a prospective clinical cohort (n = 517) and found that it classified 95% of the samples correctly.
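As a rough illustration of how two binary classifiers can be chained into a three-way call and how per-class sensitivity and specificity are then computed, here is a minimal Python sketch. The 0.5 cutoffs, the function names, and the toy scores are assumptions made for the example; they are not the thresholds, code, or data of the study.

```python
from typing import List, Sequence, Tuple


def two_stage_label(
    p_infected: float,
    p_bacterial: float,
    infected_cutoff: float = 0.5,   # assumed cutoff, not from the paper
    bacterial_cutoff: float = 0.5,  # assumed cutoff, not from the paper
) -> str:
    """Stage 1 separates noninfected from infected; stage 2 runs only if infected."""
    if p_infected < infected_cutoff:
        return "noninfected"
    return "bacterial" if p_bacterial >= bacterial_cutoff else "viral"


def sensitivity_specificity(
    truth: Sequence[str], predicted: Sequence[str], positive: str
) -> Tuple[float, float]:
    """One-vs-rest sensitivity and specificity for the class named `positive`."""
    tp = fn = fp = tn = 0
    for t, p in zip(truth, predicted):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Toy labels and scores invented for illustration; these are not study data.
truth: List[str] = [
    "bacterial", "bacterial", "viral", "viral", "noninfected", "noninfected",
]
# Each pair is (stage-1 infected score, stage-2 bacterial score).
scores = [(0.9, 0.8), (0.8, 0.3), (0.7, 0.2), (0.2, 0.1), (0.1, 0.9), (0.6, 0.7)]
predicted = [two_stage_label(p_inf, p_bac) for p_inf, p_bac in scores]

for label in ("bacterial", "viral"):
    print(label, sensitivity_specificity(truth, predicted, label))
```

In a cascade like this, a stage-1 miss cannot be recovered at stage 2, so the per-class sensitivities reflect the errors of both classifiers together.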
CONCLUSIONS
Our study shows that the InfectDiagno algorithm is a powerful and robust tool to accurately identify infection in a real-world patient population, which has the potential to profoundly improve clinical care in the field of infection diagnosis.
About the journal:
Clinical Chemistry is a peer-reviewed scientific journal that is the premier publication for the science and practice of clinical laboratory medicine. It was established in 1955 and is associated with the Association for Diagnostics & Laboratory Medicine (ADLM).
The journal focuses on laboratory diagnosis and management of patients, and has expanded to include other clinical laboratory disciplines such as genomics, hematology, microbiology, and toxicology. It also publishes articles relevant to clinical specialties including cardiology, endocrinology, gastroenterology, genetics, immunology, infectious diseases, maternal-fetal medicine, neurology, nutrition, oncology, and pediatrics.
In addition to original research, editorials, and reviews, Clinical Chemistry features recurring sections such as clinical case studies, perspectives, podcasts, and Q&A articles. It has the highest impact factor among journals of clinical chemistry, laboratory medicine, pathology, analytical chemistry, transfusion medicine, and clinical microbiology.
The journal is indexed in databases such as MEDLINE and Web of Science.