scCaT: An explainable capsulating architecture for sepsis diagnosis transferring from single-cell RNA sequencing.

PLoS Computational Biology, Volume 20, Issue 10, e1012083. Impact factor: 3.6; CAS Zone 2 (Biology); JCR Q1 (Biochemical Research Methods). Pub Date: 2024-10-21; eCollection Date: 2024-10-01; DOI: 10.1371/journal.pcbi.1012083

Xubin Zheng, Dian Meng, Duo Chen, Wan-Ki Wong, Ka-Ho To, Lei Zhu, JiaFei Wu, Yining Liang, Kwong-Sak Leung, Man-Hon Wong, Lixin Cheng

Abstract

Sepsis is a life-threatening condition characterized by an exaggerated immune response to pathogens, leading to organ damage and high mortality rates in the intensive care unit. Although deep learning has achieved impressive performance on prediction and classification tasks in medicine, it requires large amounts of data and lacks explainability, both of which hinder its application to sepsis diagnosis. We introduce a deep learning framework, scCaT, which combines a capsulating architecture with a Transformer to develop a sepsis diagnostic model on single-cell RNA sequencing data and then transfers it to bulk RNA data. The capsulating architecture groups genes into capsules based on biological function, which provides explainability in encoding gene expression. The Transformer serves as a decoder to classify sepsis patients and controls. The model achieves an AUROC of 0.93 on the single-cell test set and an average AUROC of 0.98 across seven bulk RNA cohorts. Additionally, the capsules can recognize different cell types and distinguish sepsis from control samples based on their biological pathways. This study presents a novel approach for learning gene modules and transferring the model to other data types, offering potential benefits in diagnosing rare diseases with limited subjects.
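
For readers unfamiliar with the metric, the following is a minimal sketch of how an AUROC and a per-cohort average could be computed. It is not the authors' evaluation code. The function names `auroc` and `mean_auroc`, the toy labels and scores, and the choice of an unweighted mean across cohorts are all assumptions made for illustration; the abstract does not state how the seven cohort values were averaged.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative one.
    Ties count as half a win. labels are 1 (sepsis) or 0 (control)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auroc(cohorts):
    """Unweighted mean of per-cohort AUROCs (an assumption, see lead-in)."""
    values = [auroc(labels, scores) for labels, scores in cohorts]
    return sum(values) / len(values)


# Hypothetical toy cohorts, NOT data from the paper: (labels, scores) pairs.
toy_cohorts = [
    ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),  # every positive outranks every negative
    ([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]),  # one of four positive/negative pairs is misordered
]

if __name__ == "__main__":
    for labels, scores in toy_cohorts:
        print(auroc(labels, scores))
    print(mean_auroc(toy_cohorts))
```

The pairwise form is quadratic in the number of samples, which is acceptable for small cohorts and keeps the definition explicit; a rank-based implementation would be the usual choice for larger data.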

Source journal

PLoS Computational Biology (Biochemical Research Methods; Mathematical & Computational Biology)

CiteScore: 7.10
Self-citation rate: 4.70%
Articles published: 820
Review time: 2.5 months

Journal description: PLOS Computational Biology features works of exceptional significance that further our understanding of living systems at all scales, from molecules and cells to patient populations and ecosystems, through the application of computational methods. Readers include life and computational scientists, who can take the important findings presented here to the next level of discovery. Research articles must be declared as belonging to a relevant section. More information about the sections can be found in the submission guidelines. Research articles should model aspects of biological systems, demonstrate both methodological and scientific novelty, and provide profound new biological insights. Generally, reliability and significance of biological discovery through computation should be validated and enriched by experimental studies. Inclusion of experimental validation is not required for publication, but should be referenced where possible. Inclusion of experimental validation of a modest biological discovery through computation does not render a manuscript suitable for PLOS Computational Biology. Research articles specifically designated as Methods papers should describe outstanding methods of exceptional importance that have been shown to provide, or have the promise to provide, new biological insights. The method must already be widely adopted, or have the promise of wide adoption by a broad community of users. Enhancements to existing published methods will only be considered if those enhancements bring exceptional new capabilities.