Single-cell RNA-seq data clustering by deep information fusion.

IF 2.5 | Zone 3 (Biology) | JCR Q3, Biotechnology & Applied Microbiology | Briefings in Functional Genomics | Pub Date: 2024-03-20 | DOI: 10.1093/bfgp/elad017
Liangrui Ren, Jun Wang, Wei Li, Maozu Guo, Guoxian Yu
Citations: 0

Abstract

Determining cell types from single-cell transcriptomics data is fundamental for downstream analysis. However, cell clustering and data imputation still face computational challenges because of the high dropout rate, sparsity and dimensionality of single-cell data. Although some deep learning-based solutions have been proposed to handle these challenges, they still cannot leverage gene attribute information and cell topology in a sensible way to explore a consistent clustering. In this paper, we present scDeepFC, a deep information fusion-based method for single-cell clustering and data imputation. Specifically, scDeepFC uses a deep auto-encoder (DAE) network and a deep graph convolution network to embed high-dimensional gene attribute information and high-order cell-cell topological information into separate low-dimensional representations, and then fuses them via a deep information fusion network to generate a more comprehensive and accurate consensus representation. In addition, scDeepFC integrates the zero-inflated negative binomial (ZINB) model into the DAE to model dropout events. By jointly optimizing the ZINB loss and the cell graph reconstruction loss, scDeepFC generates a salient embedding representation for clustering cells and imputing missing data. Extensive experiments on real single-cell datasets show that scDeepFC outperforms other popular single-cell analysis methods. Both gene attribute information and cell topology information can improve cell clustering.
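
The abstract names a ZINB loss for modeling dropout events but does not give its form. As a rough illustration only, the sketch below computes the standard zero-inflated negative binomial negative log-likelihood for a single count. The function names and the parameterization (mean `mu`, dispersion `theta`, dropout probability `pi`) are assumptions chosen for this example, not taken from scDeepFC itself.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of one count under a ZINB model.

    pi is the probability that an observed zero is a dropout
    (an extra point mass at zero on top of the NB component).
    """
    nb = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        # A zero comes either from dropout or from the NB component itself.
        p = pi + (1.0 - pi) * nb
    else:
        # A nonzero count can only come from the NB component.
        p = (1.0 - pi) * nb
    return -math.log(p)


# Hypothetical usage: a zero count is less costly once dropout is allowed.
loss_without_dropout = zinb_nll(0, mu=2.0, theta=1.5, pi=0.0)
loss_with_dropout = zinb_nll(0, mu=2.0, theta=1.5, pi=0.3)
```

In a ZINB auto-encoder of the general kind the abstract describes, `mu`, `theta` and `pi` would typically be produced per gene and per cell by the decoder, and this quantity would be averaged over all entries of the count matrix; how scDeepFC weights it against the cell graph reconstruction loss is not stated here.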

Source journal
Briefings in Functional Genomics (Biotechnology & Applied Microbiology; Genetics & Heredity)
CiteScore: 6.30
Self-citation rate: 2.50%
Articles published: 37
Review time: 6-12 weeks
Journal description: Briefings in Functional Genomics publishes high-quality peer-reviewed articles that focus on the use, development or exploitation of genomic approaches, and their application to all areas of biological research. As well as exploring thematic areas where these techniques and protocols are being used, articles review the impact that these approaches have had, or are likely to have, on their field. Subjects covered by the journal include but are not restricted to: the identification and functional characterisation of coding and non-coding features in genomes, microarray technologies, gene expression profiling, next generation sequencing, pharmacogenomics, phenomics, SNP technologies, transgenic systems, mutation screens and genotyping. Articles range in scope and depth from the introductory level to specific details of protocols and analyses, encompassing bacterial, fungal, plant, animal and human data. The editorial board welcomes the submission of review articles for publication. Essential criteria for publication are that papers do not contain primary data, and that they are high-quality, clearly written review articles which provide a balanced, highly informative and up-to-date perspective to researchers in the field of functional genomics.