scSMD: a deep learning method for accurate clustering of single cells based on auto-encoder
Authors: Xiaoxu Cui, Renkai Wu, Yinghao Liu, Peizhan Chen, Qing Chang, Pengchen Liang, Changyu He
Journal: BMC Bioinformatics 26(1):33 (Journal Article, published 2025-01-29)
DOI: 10.1186/s12859-025-06047-x (https://doi.org/10.1186/s12859-025-06047-x)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11780796/pdf/
Impact factor: 2.9; JCR: Q2 (Biochemical Research Methods)
Citations: 0
Abstract
Background: Single-cell RNA sequencing (scRNA-seq) has transformed biological research by offering new insights into cellular heterogeneity, developmental processes, and disease mechanisms. As scRNA-seq technology advances, its role in modern biology has become increasingly vital. This study explores the application of deep learning to single-cell data clustering, with a particular focus on managing sparse, high-dimensional data.
Results: We propose the SMD deep learning model, which integrates nonlinear dimensionality reduction techniques with a porous dilated attention gate component. Built upon a convolutional autoencoder and informed by the negative binomial distribution, the SMD model efficiently captures essential cell clustering features and dynamically adjusts feature weights. Comprehensive evaluation on both public datasets and proprietary osteosarcoma data highlights the SMD model's efficacy in achieving precise classifications for single-cell data clustering, showcasing its potential for advanced transcriptomic analysis.
Conclusion: This study underscores the potential of deep learning, specifically the SMD model, in advancing single-cell RNA sequencing data analysis. By integrating innovative computational techniques, the SMD model provides a powerful framework for unraveling cellular complexities, enhancing our understanding of biological processes, and elucidating disease mechanisms. The code is available from https://github.com/xiaoxuc/scSMD.
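The Results paragraph says the model is "informed by the negative binomial distribution" but does not spell out how. As a rough illustration only, the sketch below computes a negative binomial negative log-likelihood in the mean/dispersion form often used as a reconstruction loss for scRNA-seq counts. The parameterization, the function names `nb_log_prob` and `nb_nll`, and the use of a single shared dispersion `theta` are assumptions made for this example; they are not taken from the scSMD code.

```python
import math


def nb_log_prob(x, mu, theta, eps=1e-10):
    """Log-probability of count x under a negative binomial with mean mu
    and dispersion theta (assumed mean/dispersion parameterization)."""
    mu = max(mu, eps)        # guard against log(0) when the predicted mean is 0
    theta = max(theta, eps)  # dispersion must be strictly positive
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def nb_nll(counts, means, theta):
    """Mean negative log-likelihood of observed counts given predicted means.

    In an autoencoder setting, `means` would stand in for the decoder output
    (hypothetical here) and `counts` for the raw expression counts.
    """
    if len(counts) != len(means):
        raise ValueError("counts and means must have the same length")
    total = sum(nb_log_prob(x, mu, theta) for x, mu in zip(counts, means))
    return -total / len(counts)


if __name__ == "__main__":
    # Toy example: four observed counts and four predicted means.
    observed = [0, 3, 1, 7]
    predicted = [0.5, 2.5, 1.2, 6.0]
    print(nb_nll(observed, predicted, theta=2.0))
```

A smaller `theta` corresponds to stronger overdispersion, which is one reason this family is a common choice for sparse count data; as `theta` grows, the distribution approaches a Poisson with the same mean.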
About the journal:
BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology.
BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.