Tian-Jing Qiao, Feng Li, Sha-Sha Yuan, Ling-Yun Dai, Juan Wang
{"title":"A Fusion Learning Model Based on Deep Learning for Single-Cell RNA Sequencing Data Clustering.","authors":"Tian-Jing Qiao, Feng Li, Sha-Sha Yuan, Ling-Yun Dai, Juan Wang","doi":"10.1089/cmb.2024.0512","DOIUrl":null,"url":null,"abstract":"<p><p>\n <b>Single-cell RNA sequencing (scRNA-seq) technology provides a means for studying biology from a cellular perspective. The fundamental goal of scRNA-seq data analysis is to discriminate single-cell types using unsupervised clustering. Few single-cell clustering algorithms have taken into account both deep and surface information, despite the recent slew of suggestions. Consequently, this article constructs a fusion learning framework based on deep learning, namely scGASI. For learning a clustering similarity matrix, scGASI integrates data affinity recovery and deep feature embedding in a unified scheme based on various top feature sets. Next, scGASI learns the low-dimensional latent representation underlying the data using a graph autoencoder to mine the hidden information residing in the data. To efficiently merge the surface information from raw area and the deeper potential information from underlying area, we then construct a fusion learning model based on self-expression. scGASI uses this fusion learning model to learn the similarity matrix of an individual feature set as well as the clustering similarity matrix of all feature sets. Lastly, gene marker identification, visualization, and clustering are accomplished using the clustering similarity matrix. Extensive verification on actual data sets demonstrates that scGASI outperforms many widely used clustering techniques in terms of clustering accuracy.</b>\n </p>","PeriodicalId":15526,"journal":{"name":"Journal of Computational Biology","volume":" ","pages":"576-588"},"PeriodicalIF":1.4000,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Computational Biology","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1089/cmb.2024.0512","RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/5/20 0:00:00","PubModel":"Epub","JCR":"Q4","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
Single-cell RNA sequencing (scRNA-seq) technology provides a means for studying biology from a cellular perspective. The fundamental goal of scRNA-seq data analysis is to discriminate single-cell types using unsupervised clustering. Few single-cell clustering algorithms have taken into account both deep and surface information, despite the recent slew of suggestions. Consequently, this article constructs a fusion learning framework based on deep learning, namely scGASI. For learning a clustering similarity matrix, scGASI integrates data affinity recovery and deep feature embedding in a unified scheme based on various top feature sets. Next, scGASI learns the low-dimensional latent representation underlying the data using a graph autoencoder to mine the hidden information residing in the data. To efficiently merge the surface information from raw area and the deeper potential information from underlying area, we then construct a fusion learning model based on self-expression. scGASI uses this fusion learning model to learn the similarity matrix of an individual feature set as well as the clustering similarity matrix of all feature sets. Lastly, gene marker identification, visualization, and clustering are accomplished using the clustering similarity matrix. Extensive verification on actual data sets demonstrates that scGASI outperforms many widely used clustering techniques in terms of clustering accuracy.
期刊介绍:
Journal of Computational Biology is the leading peer-reviewed journal in computational biology and bioinformatics, publishing in-depth statistical, mathematical, and computational analysis of methods, as well as their practical impact. Available only online, this is an essential journal for scientists and students who want to keep abreast of developments in bioinformatics.
Journal of Computational Biology coverage includes:
-Genomics
-Mathematical modeling and simulation
-Distributed and parallel biological computing
-Designing biological databases
-Pattern matching and pattern detection
-Linking disparate databases and data
-New tools for computational biology
-Relational and object-oriented database technology for bioinformatics
-Biological expert system design and use
-Reasoning by analogy, hypothesis formation, and testing by machine
-Management of biological databases